Section One:¶
The coronavirus disease (COVID-19) is a global pandemic caused by the SARS-CoV-2 virus. Globally, there have been over 400 million confirmed cases since the first reported case in November 2019, and the death toll has exceeded 5 million. COVID-19 is an infectious disease that spreads quickly from an infected person through droplets from the mouth and nose. Most cases are mild to moderate; however, the people who develop severe symptoms or die are mostly older or have underlying medical conditions.
The main diagnostic approaches to detect the virus are reverse transcription-polymerase chain reaction (RT-PCR), computed tomography (CT) and chest X-ray, in addition to pneumonia symptoms. Medical centres hold enormous databases of images collected through X-ray and CT. Therefore, fast and early detection of the disease can help control the spread of COVID-19.
On the other hand, computer vision is a trending field that plays a significant role in daily life, from online shopping to assisting doctors and medical providers in diagnosis. Computer vision algorithms benefit from the convolutional neural network (CNN), a type of deep learning that automates image classification and therefore supports early detection of the disease. The network's inputs are images, while its output is a classification based on image features. Many companies, such as Google and Microsoft, benefit from using CNNs and are working toward novel designs.
The COVID-19 pandemic has had a significant impact worldwide, specifically on health sectors, as reflected in mortality and morbidity records. Therefore, COVID-19 has attracted the interest and attention of many researchers and publishers. Below are two studies that used CNNs to classify COVID-19 from X-ray images.
Abbas et al. (2021) classified COVID-19 from X-ray images by adapting a CNN architecture called Decompose, Transfer, and Compose (DeTraC). The main goal was to overcome the irregularities in annotated data. DeTraC was validated with various models pre-trained on ImageNet, such as VGG19, ResNet and GoogleNet. It achieved significantly high accuracy in detecting COVID-19 from X-ray images, around 98% with VGG19, while DeTraC's overall accuracy was 93.1% with 100% sensitivity.
Another study automated the detection of COVID-19 from X-ray images using CNNs. It aimed to distinguish COVID-19 patients from healthy subjects and from viral and bacterial pneumonia. A deep transfer learning approach was applied with nine different pre-trained models, such as GoogleNet, ResNet-50, Se-ResNeXt-50 and Inception V4. Se-ResNeXt-50 achieved the highest classification accuracy for the binary and multi-class tasks, with 99.3% and 97.6%, respectively.
The analysis was conducted to detect COVID-19 cases by extracting COVID-19 graphical features to classify chest X-ray images. This is a single-label binary classification problem (normal/covid) and will be detailed further when describing the dataset.
The main objective of this coursework is to develop a model that is more accurate than the baseline model. More specifically, to build a network that is trained on a portion of the dataset (the training set) and validated on the validation set; the tuned hyperparameters and the optimum epoch are then used to train on the whole training set, so that the model can predict on an unseen dataset (the test set).
The following hypotheses were considered while developing the NN model:
The notebook consists of six main sections. The 1st section covers the topic introduction, model architecture and methodology. The 2nd section covers the preprocessing and modelling (from scratch) phases, from the baseline to the enhanced model. The 3rd section applies a pre-trained convnet model. The following section applies the enhanced model to the unseen dataset (test data). After that, general convolutional visualisations are illustrated. Finally, the last section covers the results, conclusions, references and appendix.
A convolutional neural network (CNN) is applied to the dataset to classify a patient's x-ray image as normal or covid.
The first step was to import the necessary libraries used throughout the modelling phases. Following that, covid and normal x-ray images were imported, and a balanced sample was taken from the original dataset to ease the modelling process with a small dataset. For each class (covid/normal), this sample was split into train, validation and test sets of 1000, 500 and 500 images, respectively.
In the next step, pre-processing, the images were rescaled and resized, which saved time during training. The pixel values were rescaled from the 0-255 range to the [0, 1] interval, and the images were resized from 299 by 299 to a target size of 150 by 150 pixels.
A baseline model was initialized as a benchmark to be beaten by the final, enhanced model. Then a model with statistical power was built and expanded gradually by increasing the batch size, neurons and filters until it overfitted, resulting in a model with high training accuracy that failed to generalize to new data; an optimum epoch was derived from the overfitted model.
Many methods were applied to mitigate the overfitting, such as using callbacks to stop training early when there is no improvement. Moreover, reducing the network size, data augmentation, adding dropout and weight regularization were applied and compared to the overfitted model. Furthermore, hyperparameter tuning was applied using various techniques, tuning the batch size, number of filters, activation function, optimizer, filter window size, number of layers and padding.
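As a standalone illustration of two of the mitigation techniques mentioned above, a minimal sketch of how dropout and L2 weight regularization can be combined in a model like the ones used in this notebook; the layer sizes, dropout rate and L2 factor here are illustrative assumptions, not the tuned coursework values:

```python
from tensorflow.keras import layers, models, regularizers

def build_regularized_model(dropout_rate=0.5, l2_factor=0.001):
    model = models.Sequential()
    # same 150x150 grayscale inputs used throughout the notebook
    model.add(layers.Conv2D(32, (3, 3), activation='relu',
                            input_shape=(150, 150, 1)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    # dropout randomly zeroes a fraction of activations during training
    model.add(layers.Dropout(dropout_rate))
    # the L2 penalty discourages large weights in the dense layer
    model.add(layers.Dense(64, activation='relu',
                           kernel_regularizer=regularizers.l2(l2_factor)))
    model.add(layers.Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='rmsprop',
                  metrics=['accuracy'])
    return model
```

Both techniques shrink the gap between training and validation loss at the cost of slower convergence, which is why they are compared against the overfitted model rather than applied blindly.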
Additionally, depthwise separable convolution and batch normalization were deployed as advanced techniques. Then a pre-trained MiniVGGNet model for grayscale images was implemented using different approaches. Finally, the model was adjusted using the best-tuned parameters with the optimum epoch, trained on the training dataset and then evaluated on the test set; the accuracies of the baseline, trained and predicted models were compared, and predictions were generated with the final model.
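For the advanced techniques mentioned above, a minimal sketch of a network using depthwise separable convolutions with batch normalization, assuming the same 150x150 grayscale inputs; the filter counts are illustrative, not the coursework's exact configuration:

```python
from tensorflow.keras import layers, models

def build_separable_model():
    model = models.Sequential()
    # SeparableConv2D factorizes a convolution into a depthwise pass
    # followed by a 1x1 pointwise pass, using far fewer parameters
    model.add(layers.SeparableConv2D(32, (3, 3), activation='relu',
                                     input_shape=(150, 150, 1)))
    model.add(layers.BatchNormalization())  # normalize activations per batch
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.SeparableConv2D(64, (3, 3), activation='relu'))
    model.add(layers.BatchNormalization())
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(64, activation='relu'))
    model.add(layers.Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='rmsprop',
                  metrics=['accuracy'])
    return model
```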
Below are the factors that were considered when building and tuning the neural network model:
# !pip install tensorflow
# !pip install opencv-python
import os, shutil
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix,classification_report
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import RandomizedSearchCV
from collections import Counter
import tensorflow as tf
from tensorflow import keras
from keras import optimizers
from keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint
from keras.preprocessing.image import ImageDataGenerator
from keras import layers
from keras import models
from keras.layers import Input, Lambda, Dense, Flatten
from keras.models import Model
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
from keras.preprocessing import image
from keras.models import Sequential
from keras import regularizers
from keras.utils.vis_utils import plot_model
print(tf.__version__, ' ', tf.keras.__version__)
2.8.0 2.8.0
The X-ray images database, called the COVID-19 Radiography Database, was obtained from Kaggle. It was developed by a team of researchers together with other collaborators, and collected from various sources such as the Italian Society of Medical and Interventional Radiology (SIRM) COVID-19 Database, over 40 publications, the Chest X-Ray Images (Pneumonia) database and the Novel Corona Virus 2019 Dataset.
The dataset is publicly available to be used for academic purposes and can be accessed from the following link (here)
The database consists of four classes: COVID, Lung_Opacity, Normal and Viral Pneumonia, with 3616, 6012, 10.2k and 1345 files, respectively. Only the COVID and Normal files were used in this coursework, with 2000 images per category. To obtain balanced classes, both Normal and COVID were split into 1000 train, 500 validation and 500 test images.
The code below was run once when the files were first loaded; the active code that follows is run instead to read the datasets.
# # original directory path
# original_dataset_dir = '/Users/maiad/OneDrive/Desktop/NN CW 2/Dataset/Covid_Normal'
# # directory path for the new small dataset
# base_dir = '/Users/maiad/OneDrive/Desktop/NN CW 2/Dataset/Covid_Normal_small'
# os.mkdir(base_dir)
# # splitted training, validation and test directories
# train_dir = os.path.join(base_dir, 'train')
# os.mkdir(train_dir)
# validation_dir = os.path.join(base_dir, 'validation')
# os.mkdir(validation_dir)
# test_dir = os.path.join(base_dir, 'test')
# os.mkdir(test_dir)
# # training covid x-ray directory
# train_covid_dir = os.path.join(train_dir, 'covid')
# os.mkdir(train_covid_dir)
# # training normal x-ray directory
# train_normal_dir = os.path.join(train_dir, 'normal')
# os.mkdir(train_normal_dir)
# # validation covid x-ray directory
# validation_covid_dir = os.path.join(validation_dir, 'covid')
# os.mkdir(validation_covid_dir)
# # validation normal x-ray directory
# validation_normal_dir = os.path.join(validation_dir, 'normal')
# os.mkdir(validation_normal_dir)
# # test covid x-ray directory
# test_covid_dir = os.path.join(test_dir, 'covid')
# os.mkdir(test_covid_dir)
# # test normal x-ray directory
# test_normal_dir = os.path.join(test_dir, 'normal')
# os.mkdir(test_normal_dir)
# # 1000 covid images will be copied to train_covid_dir
# fnames = ['COVID-{}.png'.format(i) for i in range(1, 1001)]
# for fname in fnames:
#     src = os.path.join(original_dataset_dir, fname)
#     dst = os.path.join(train_covid_dir, fname)
#     shutil.copyfile(src, dst)
# # 500 covid images will be copied to validation_covid_dir
# fnames = ['COVID-{}.png'.format(i) for i in range(1001, 1501)]
# for fname in fnames:
#     src = os.path.join(original_dataset_dir, fname)
#     dst = os.path.join(validation_covid_dir, fname)
#     shutil.copyfile(src, dst)
# # 500 covid images will be copied to test_covid_dir
# fnames = ['COVID-{}.png'.format(i) for i in range(1501, 2001)]
# for fname in fnames:
#     src = os.path.join(original_dataset_dir, fname)
#     dst = os.path.join(test_covid_dir, fname)
#     shutil.copyfile(src, dst)
# # 1000 normal images will be copied to train_normal_dir
# fnames = ['Normal-{}.png'.format(i) for i in range(1, 1001)]
# for fname in fnames:
#     src = os.path.join(original_dataset_dir, fname)
#     dst = os.path.join(train_normal_dir, fname)
#     shutil.copyfile(src, dst)
# # 500 normal images will be copied to validation_normal_dir
# fnames = ['Normal-{}.png'.format(i) for i in range(1001, 1501)]
# for fname in fnames:
#     src = os.path.join(original_dataset_dir, fname)
#     dst = os.path.join(validation_normal_dir, fname)
#     shutil.copyfile(src, dst)
# # 500 normal images will be copied to test_normal_dir
# fnames = ['Normal-{}.png'.format(i) for i in range(1501, 2001)]
# for fname in fnames:
#     src = os.path.join(original_dataset_dir, fname)
#     dst = os.path.join(test_normal_dir, fname)
#     shutil.copyfile(src, dst)
# small dataset directory
base_dir = '/Users/maiad/OneDrive/Desktop/NN CW 2/Dataset/Covid_Normal_small'
# splitted training, validation and test directories
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')
test_dir = os.path.join(base_dir, 'test')
# training covid x-ray directory
train_covid_dir = os.path.join(train_dir, 'covid')
# training normal x-ray directory
train_normal_dir = os.path.join(train_dir, 'normal')
# validation covid x-ray directory
validation_covid_dir = os.path.join(validation_dir, 'covid')
# validation normal x-ray directory
validation_normal_dir = os.path.join(validation_dir, 'normal')
# test covid x-ray directory
test_covid_dir = os.path.join(test_dir, 'covid')
# test normal x-ray directory
test_normal_dir = os.path.join(test_dir, 'normal')
# plot the image counts as a bar chart (uses the generators defined below)
train_counter = Counter(train_generator.classes)
train_covid = train_counter[0]
train_normal = train_counter[1]
validation_counter = Counter(validation_generator.classes)
validation_covid = validation_counter[0]
validation_normal = validation_counter[1]
test_counter = Counter(test_generator.classes)
test_covid = test_counter[0]
test_normal = test_counter[1]
df = pd.DataFrame([[train_covid, train_normal],
[validation_covid, validation_normal],
[test_covid, test_normal]],
['Train', 'Validation', 'Test'],
columns = ['Covid', 'Normal'])
df.plot(kind = 'bar')
<AxesSubplot:>
train_generator.class_indices
{'covid': 0, 'normal': 1}
Rescale the pixel values (between 0 and 255) to the [0, 1] interval (as you know, neural networks prefer small input values), and set the target size to 150 by 150. Resizing and rescaling were applied to the images to save training time.
ImageDataGenerator was used to read the images from the directories.
# reading the dataset
target_size = (150, 150)
batch_size = 20
# pixel values range from 0 to 255
train_datagen = ImageDataGenerator(rescale = 1./255)
test_datagen = ImageDataGenerator(rescale = 1./255)
train_generator = train_datagen.flow_from_directory(
train_dir,
target_size = target_size,
batch_size = batch_size,
class_mode='binary',
color_mode = 'grayscale')
validation_generator = test_datagen.flow_from_directory(
validation_dir,
target_size = target_size,
batch_size = batch_size,
class_mode = 'binary',
# the used images are black and white
color_mode = 'grayscale')
test_generator = test_datagen.flow_from_directory(
test_dir,
target_size = target_size,
batch_size = batch_size,
class_mode = 'binary',
# the used images are black and white
color_mode = 'grayscale')
Found 2000 images belonging to 2 classes. Found 1000 images belonging to 2 classes. Found 1000 images belonging to 2 classes.
# define a model function with units parameters to be easily called every time
def build_model(filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch):
    model = Sequential()
    model.add(layers.Conv2D(filter1, patch, activation = active, input_shape = (150, 150, 1)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(filter2, patch, activation = active))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(filter3, patch, activation = active))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(filter4, patch, activation = active))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(neuron1, activation = active))
    model.add(layers.Dense(1, activation = 'sigmoid'))
    model.compile(loss = 'binary_crossentropy', optimizer = optimizer, metrics = ['accuracy'])
    return model
# model.summary()
def model_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch):
    # these were set as globals so they can be accessed outside the function
    global results_val_loss
    global results_val_acc
    global results_train_loss
    global results_train_acc
    model = build_model(filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch)
    history = model.fit(
        train_generator_type,
        steps_per_epoch = len(train_generator_type.filenames) // batch_size,
        epochs = epoch_no,
        validation_data = validation_generator,
        validation_steps = len(validation_generator.filenames) // batch_size,
        verbose = 0)
    results_val_loss = history.history['val_loss']
    results_val_acc = history.history['val_accuracy']
    results_train_loss = history.history['loss']
    results_train_acc = history.history['accuracy']
# define the plot function with its various parameters
# this function will be used throughout model development stages to monitor the validation loss values
def plot1(epochs, y1, style_1, label_1, y2, style_2, label_2, xlabel, ylabel, title):
    # 'epochs' is the number of epochs; convert it to a range for the x-axis
    epochs = range(1, epochs + 1)
    plt.clf()
    plt.plot(epochs, y1, style_1, label = label_1)
    plt.plot(epochs, y2, style_2, label = label_2)
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.title(title)
    plt.legend()
    plt.show()
# this plot function will be used when accuracy and loss compared together
def plot2(epoch_no, x1, x2, y1, y2):
    plt.clf()
    plt.figure(figsize = (8, 4))
    plt.subplot(1, 2, 1)
    plt.plot(range(epoch_no), x1, label = 'Training Accuracy')
    plt.plot(range(epoch_no), x2, label = 'Validation Accuracy')
    plt.legend(loc = 'lower right')
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.title('Training and Validation Accuracy')
    plt.subplot(1, 2, 2)
    plt.plot(range(epoch_no), y1, label = 'Training Loss')
    plt.plot(range(epoch_no), y2, label = 'Validation Loss')
    plt.legend(loc = 'upper right')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.title('Training and Validation Loss')
    plt.show()
The baseline model is used as a benchmark for the other developed models. The main target is to develop a model that beats the baseline.
There are two approaches to initialize a baseline model: by common sense or by basic machine learning. Which is best depends on the task and on whether the dataset is balanced. In this case, both approaches are applied and discussed to see which is suitable.
After downloading the dataset, a balanced subset was created from the original one.
Each set (train, validation and test) was divided equally between normal and covid. This balance leads to an accurate common-sense baseline.
print('total training covid xrays:', len(os.listdir(train_covid_dir)))
total training covid xrays: 1000
print('total training normal xrays:', len(os.listdir(train_normal_dir)))
total training normal xrays: 1000
print('total validation covid xrays:', len(os.listdir(validation_covid_dir)))
total validation covid xrays: 500
print('total validation normal xrays:', len(os.listdir(validation_normal_dir)))
total validation normal xrays: 500
print('total test covid xrays:', len(os.listdir(test_covid_dir)))
total test covid xrays: 500
print('total test normal xrays:', len(os.listdir(test_normal_dir)))
total test normal xrays: 500
This is a balanced binary classification problem with a common-sense baseline prediction accuracy of 50%.
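The common-sense baseline can be computed directly from the class counts, since it is simply the accuracy of always predicting the majority class; a minimal sketch (the helper name is illustrative):

```python
from collections import Counter

def majority_class_accuracy(labels):
    # accuracy of a classifier that always predicts the most frequent class
    counts = Counter(labels)
    return max(counts.values()) / len(labels)

# e.g. a balanced split of 1000 covid (0) and 1000 normal (1) labels
labels = [0] * 1000 + [1] * 1000
print(majority_class_accuracy(labels))  # 0.5
```

Any model whose test accuracy does not clearly exceed this value has learned nothing useful about the images.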
A simple baseline model was designed with one hidden layer, and a sigmoid function was used throughout the modelling process (as mentioned in the model architecture). The baseline model was trained on the training set, validated on the validation set and then evaluated on the test set (unseen data).
The number of channels given in the input_shape parameter equals 1, which means the x-ray images used are black and white (grayscale).
model = Sequential()
model.add(layers.Flatten(input_shape = (150, 150, 1)))
model.add(layers.Dense(32, activation = 'relu'))
model.add(layers.Dense(1, activation = 'sigmoid'))
# compiling the network
model.compile(optimizer = 'rmsprop', loss = 'binary_crossentropy', metrics = ['accuracy'])
# model.summary()
# set the function arguments
batch_size = 20
epoch_no = 10
history = model.fit(train_generator,
steps_per_epoch = len(train_generator.filenames) // batch_size,
epochs = epoch_no,
validation_data = validation_generator,
validation_steps = len(validation_generator.filenames) // batch_size,
verbose = 0)
baseline_loss, baseline_acc = model.evaluate(test_generator, verbose = 0)
print('Baseline Model Accuracy:', '%.f'% (baseline_acc*100), '%')
Baseline Model Accuracy: 51 %
The baseline model accuracy for both approaches is 51%. The next phase is to build a model that beats the baseline.
Here, a small, underfitted model is built to beat the defined baseline; this model will be enhanced gradually until it overfits.
Generating a basic model with higher accuracy than the baseline model by adding convolutional layers.
# set the model parameters
filter1 = 4
filter2 = 8
neuron1 = 16
# create an empty network
model = Sequential()
# adding layers
model.add(layers.Conv2D(filter1, (3, 3), activation = 'relu', input_shape = (150, 150, 1)))
model.add(layers.MaxPooling2D(pool_size = 2))
model.add(layers.Conv2D(filter2, (3, 3), activation = 'relu'))
model.add(layers.MaxPooling2D(pool_size = 2))
model.add(layers.Flatten())
model.add(layers.Dense(neuron1, activation = 'relu'))
model.add(layers.Dense(1, activation = 'sigmoid'))
# compiling the network
model.compile(optimizer = 'rmsprop', loss = 'binary_crossentropy', metrics = ['accuracy'])
# model.summary()
# set the function arguments
batch_size = 20
epoch_no = 10
history = model.fit(
train_generator,
steps_per_epoch = len(train_generator.filenames) // batch_size,
epochs = epoch_no,
validation_data = validation_generator,
validation_steps = len(validation_generator.filenames) // batch_size,
verbose = 0)
model1_train_acc = history.history['accuracy']
model1_val_acc = history.history['val_accuracy']
model1_train_loss = history.history['loss']
model1_val_loss = history.history['val_loss']
for data_batch, labels_batch in train_generator:
    print('data batch shape:', data_batch.shape)
    print('labels batch shape:', labels_batch.shape)
    break
data batch shape: (20, 150, 150, 1) labels batch shape: (20,)
plot2(epoch_no, model1_train_acc, model1_val_acc, model1_train_loss, model1_val_loss)
<Figure size 432x288 with 0 Axes>
# get the maximum accuracy and minimum loss
model1_loss = round(min(model1_val_loss), 3)
model1_accuracy = round(max(model1_val_acc), 3)
model1_accuracy
0.923
print('Baseline Model Score:', '%.f'% (baseline_acc*100), '%')
print('Model 1 Score:', '%.f'% (model1_accuracy*100), '%')
Baseline Model Score: 51 % Model 1 Score: 92 %
# set the model parameters
filter1 = 16
filter2 = 32
neuron1 = 64
# create an empty network
model = Sequential()
# adding layers
model.add(layers.Conv2D(filter1, (3, 3), activation = 'relu', input_shape = (150, 150, 1)))
model.add(layers.MaxPooling2D(pool_size = 2))
model.add(layers.Conv2D(filter2, (3, 3), activation = 'relu'))
model.add(layers.MaxPooling2D(pool_size = 2))
model.add(layers.Flatten())
model.add(layers.Dense(neuron1, activation = 'relu'))
model.add(layers.Dense(1, activation = 'sigmoid'))
# compiling the network
model.compile(optimizer = 'rmsprop', loss = 'binary_crossentropy', metrics = ['accuracy'])
# model.summary()
# set the function arguments
batch_size = 128
epoch_no = 20
history = model.fit(
train_generator,
steps_per_epoch = len(train_generator.filenames) // batch_size,
epochs = epoch_no,
validation_data = validation_generator,
validation_steps = len(validation_generator.filenames) // batch_size,
verbose = 0)
model2_train_acc = history.history['accuracy']
model2_val_acc = history.history['val_accuracy']
model2_train_loss = history.history['loss']
model2_val_loss = history.history['val_loss']
plot2(epoch_no, model2_train_acc, model2_val_acc, model2_train_loss, model2_val_loss)
<Figure size 432x288 with 0 Axes>
# get the maximum accuracy and minimum loss
model2_loss = round(min(model2_val_loss), 3)
model2_accuracy = round(np.max(model2_val_acc), 3)
model2_accuracy
0.936
print('Baseline Model Score:', '%.f'% (baseline_acc*100), '%')
print('Model 1 Score:', '%.f'% (model1_accuracy*100), '%')
print('Model 2 Score:', '%.f'% (model2_accuracy*100), '%')
Baseline Model Score: 51 % Model 1 Score: 92 % Model 2 Score: 94 %
# set the model parameters
batch_size = 20
epoch_no = 50
# default train generator without data augmentation
train_generator_type = train_generator
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
model_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch)
# rename the variables to be able to compare the values for different model
# and avoid overwriting the function values
model3_train_acc = results_train_acc
model3_val_acc = results_val_acc
model3_train_loss = results_train_loss
model3_val_loss = results_val_loss
plot2(epoch_no, model3_train_acc, model3_val_acc, model3_train_loss, model3_val_loss)
<Figure size 432x288 with 0 Axes>
# get the maximum accuracy and minimum loss
model3_loss = round(min(model3_val_loss), 3)
model3_accuracy = round(np.max(model3_val_acc), 3)
model3_accuracy
0.949
print('Baseline Model Score:', '%.f'% (baseline_acc*100), '%')
print('Model 1 Score:', '%.2f'% (model1_accuracy*100), '%')
print('Model 2 Score:', '%.2f'% (model2_accuracy*100), '%')
print('Model 3 Score:', '%.2f'% (model3_accuracy*100), '%')
Baseline Model Score: 51 % Model 1 Score: 92.30 % Model 2 Score: 93.60 % Model 3 Score: 94.90 %
# summarize the results in a dataframe
df_summary = pd.DataFrame([[model1_loss, model1_accuracy],
[model2_loss, model2_accuracy],
[model3_loss, model3_accuracy]],
['Model 1', 'Model 2', 'Model 3'],
columns = ['Val Loss', 'Val Accuracy']).round(3)
display(df_summary)
|  | Val Loss | Val Accuracy |
|---|---|---|
| Model 1 | 0.216 | 0.923 |
| Model 2 | 0.213 | 0.936 |
| Model 3 | 0.194 | 0.949 |
# save the model
model.save('overfitting.h5')
After generating the overfitted model, a manual approach was adopted to find the optimum epoch (cut-off point), beyond which the model no longer improves, by inspecting the validation loss graph as shown in the code below.
This optimum epoch will be used at the end, after improving the model through regularisation and hyperparameter tuning, when training on the whole training set and predicting on the unseen data (test data).
opt_epoch = round(np.argmin(model3_val_loss), 3)
opt_epoch
10
Keras callbacks can handle the optimum epoch by interrupting training once there is no further improvement.
Keras callbacks are tested here and compared with the manual technique, and the epoch number at which training stopped is used in regularization and hyperparameter tuning.
To keep results comparable despite the epoch count changing at each use of callbacks, the callback results are fixed here. When training on the whole training set for evaluation on the test data, the callbacks will be used again.
def model_evaluation_callback(batch_size, epoch_no, train_generator_type,
                              filter1, filter2, filter3, filter4, neuron1,
                              active, optimizer, patch, callback):
    # these were set as globals so they can be accessed outside the function
    global results_val_loss
    global results_val_acc
    global results_train_loss
    global results_train_acc
    model = build_model(filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch)
    history = model.fit(
        train_generator_type,
        steps_per_epoch = len(train_generator_type.filenames) // batch_size,
        epochs = epoch_no,
        validation_data = validation_generator,
        validation_steps = len(validation_generator.filenames) // batch_size,
        verbose = 0,
        callbacks = callback)
    results_val_loss = history.history['val_loss']
    results_val_acc = history.history['val_accuracy']
    results_train_loss = history.history['loss']
    results_train_acc = history.history['accuracy']
# set the model parameters
batch_size = 20
epoch_no = 50
# default train generator without data augmentation
train_generator_type = train_generator
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
callback = [keras.callbacks.EarlyStopping(monitor = 'accuracy', patience = 5),
keras.callbacks.ModelCheckpoint(filepath = 'my_model.h5',
monitor = 'val_loss', save_best_only = True)]
model_evaluation_callback(batch_size, epoch_no, train_generator_type, filter1,
filter2, filter3, filter4, neuron1, active, optimizer,
patch, callback)
# rename the variables to be able to compare the values for different model
# and avoid overwriting the function values
callback_train_acc = results_train_acc
callback_val_acc = results_val_acc
callback_train_loss = results_train_loss
callback_val_loss = results_val_loss
callback_epoch_no = len(callback_train_acc)
callback_epoch_no
plot2(callback_epoch_no, callback_train_acc, callback_val_acc, callback_train_loss,
callback_val_loss)
<Figure size 432x288 with 0 Axes>
print('Callback Optimum Epoch:', callback_epoch_no)
Callback Optimum Epoch: 35
# get the maximum accuracy and minimum loss
callback_loss = round(min(callback_val_loss), 3)
callback_accuracy = round(np.max(callback_val_acc), 3)
callback_accuracy
0.947
The optimum epoch when the callbacks method was applied is 35.
# callback_opt_epoch = round(np.argmin(callback_val_loss), 3)
# print('Callback Optimum Epoch:', callback_opt_epoch)
# print('Manual Optimum Epoch:', opt_epoch)
print('Manual overfitted Model Score:', '%.2f'% (model3_accuracy*100), '%')
print('Callbacks Model Score:', '%.2f'% (callback_accuracy*100), '%')
Manual overfitted Model Score: 94.90 % Callbacks Model Score: 94.70 %
# summarize the results in a dataframe
df_summary = pd.DataFrame([[model3_loss, model3_accuracy],
[callback_loss, callback_accuracy]],
['Manual Model', 'Callback Model'],
columns = ['Val Loss', 'Val Accuracy']).round(3)
display(df_summary)
|  | Val Loss | Val Accuracy |
|---|---|---|
| Manual Model | 0.194 | 0.949 |
| Callback Model | 0.234 | 0.947 |
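The following cells use an augmented training generator (`train_generator_augm`) defined in an earlier data-augmentation cell not shown here. A minimal sketch of how such a generator can be built; the helper name and the augmentation parameters are assumptions, not the exact coursework values:

```python
from keras.preprocessing.image import ImageDataGenerator

def make_augmented_generator(train_dir, target_size=(150, 150), batch_size=20):
    # augmentation parameters below are illustrative assumptions
    train_datagen_augm = ImageDataGenerator(
        rescale=1./255,            # pixel values 0-255 -> [0, 1]
        rotation_range=20,         # random rotations up to 20 degrees
        width_shift_range=0.1,     # random horizontal shifts
        height_shift_range=0.1,    # random vertical shifts
        zoom_range=0.1)            # random zooms
    return train_datagen_augm.flow_from_directory(
        train_dir,
        target_size=target_size,
        batch_size=batch_size,
        class_mode='binary',
        color_mode='grayscale')

# usage, mirroring the notebook's variable name:
# train_generator_augm = make_augmented_generator(train_dir)
```

Augmentation only applies to the training generator; the validation and test generators keep plain rescaling so the evaluation images stay unmodified.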
# set the model parameters
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 32
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
model_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch)
# rename the variables to be able to compare the values for different model
# and avoid overwriting the function values
neuron32_train_acc = results_train_acc
neuron32_val_acc = results_val_acc
neuron32_train_loss = results_train_loss
neuron32_val_loss = results_val_loss
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Original 512 Neurons',
y2 = neuron32_val_loss, style_2 = 'r', label_2 = '32 Neurons',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Validation Loss Model With 32 and Original 512 Neurons')
# set the model parameters
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 64
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
model_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch)
# rename the variables to be able to compare the values for different model
# and avoid overwriting the function values
neuron64_train_acc = results_train_acc
neuron64_val_acc = results_val_acc
neuron64_train_loss = results_train_loss
neuron64_val_loss = results_val_loss
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Original 512 Neurons',
y2 = neuron64_val_loss, style_2 = 'r', label_2 = '64 Neurons',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Validation Loss Model With 64 and Original 512 Neurons')
# set the model parameters
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 128
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
model_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch)
# rename the variables to be able to compare the values for different model
# and avoid overwriting the function values
neuron128_train_acc = results_train_acc
neuron128_val_acc = results_val_acc
neuron128_train_loss = results_train_loss
neuron128_val_loss = results_val_loss
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Original 512 Neurons',
y2 = neuron128_val_loss, style_2 = 'r', label_2 = '128 Neurons',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Validation Loss Model With 128 and Original 512 Neurons')
neuron32_acc = max(neuron32_val_acc)
neuron64_acc = max(neuron64_val_acc)
neuron128_acc = max(neuron128_val_acc)
augm_acc = max(augm_val_acc)
neuron32_loss = min(neuron32_val_loss)
neuron64_loss = min(neuron64_val_loss)
neuron128_loss = min(neuron128_val_loss)
augm_loss = min(augm_val_loss)
# summarizing the results in a data frame
df_summary = pd.DataFrame([[neuron32_acc, neuron32_loss], [neuron64_acc, neuron64_loss], [neuron128_acc, neuron128_loss], [augm_acc, augm_loss]],
['32 Neurons', '64 Neurons', '128 Neurons', 'Original 512 Neurons'],
columns=['Validation Accuracy', 'Validation Loss']).round(3)
display(df_summary)
value = min(df_summary['Validation Loss'])
item = df_summary[['Validation Loss']].idxmin().tolist()
print('Best value: {}, Using {}'.format(value, item))
| | Validation Accuracy | Validation Loss |
|---|---|---|
| 32 Neurons | 0.912 | 0.223 |
| 64 Neurons | 0.913 | 0.231 |
| 128 Neurons | 0.915 | 0.227 |
| Original 512 Neurons | 0.912 | 0.229 |
Best value: 0.223, Using ['32 Neurons']
# plot 4 different models with neuron counts 32, 64, 128 and the original 512
nEpoch = callback_epoch_no
epochs = range(1, nEpoch + 1)
plt.plot(epochs, neuron32_val_loss, label = '32 Neurons')
plt.plot(epochs, neuron64_val_loss, label = '64 Neurons')
plt.plot(epochs, neuron128_val_loss, label = '128 Neurons')
plt.plot(epochs, augm_val_loss, label = 'Original 512 Neurons')
plt.title('Validation Loss Comparisons Across Four Models')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
def smooth_curve(points, factor = 0.8):
smoothed_points = []
for point in points:
if smoothed_points:
previous = smoothed_points[-1]
smoothed_points.append(previous * factor + point * (1 - factor))
else:
smoothed_points.append(point)
return smoothed_points
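As a quick sanity check on the smoothing logic above (toy values, purely illustrative): each point after the first is a 0.8/0.2 blend of the running average and the new raw value, so an initial spike decays geometrically:

```python
# exponential moving average with factor 0.8, same rule as smooth_curve
factor = 0.8
points = [1.0, 0.0, 0.0]  # a single spike followed by zeros
smoothed = []
for p in points:
    # first point passes through unchanged; later points are damped
    smoothed.append(smoothed[-1] * factor + p * (1 - factor) if smoothed else p)
print([round(s, 2) for s in smoothed])  # → [1.0, 0.8, 0.64]
```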
plt.plot(epochs, smooth_curve(neuron32_val_loss), label = '32 Neurons')
plt.plot(epochs, smooth_curve(neuron64_val_loss), label = '64 Neurons')
plt.plot(epochs, smooth_curve(neuron128_val_loss), label = '128 Neurons')
plt.plot(epochs, smooth_curve(augm_val_loss), label = 'Original 512 Neurons')
plt.title('Smoothed Validation Loss')
plt.legend()
# the below code will be used in data augmentation technique
train_datagen_augm = ImageDataGenerator(
rescale = 1./255,  # normalize pixel values from [0, 255] to [0, 1]
rotation_range = 40,
width_shift_range = 0.2,
height_shift_range = 0.2,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True,
fill_mode = 'nearest')
train_generator_augm = train_datagen_augm.flow_from_directory(
train_dir,
target_size = target_size,
batch_size = batch_size,
class_mode='binary',
# the used images are black and white
color_mode = 'grayscale')
Found 2000 images belonging to 2 classes.
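To make the augmentation settings concrete, here is a minimal pure-Python illustration (toy 2×3 "image" assumed) of what `horizontal_flip` does to a single sample; the generator above applies this kind of transform randomly on the fly, together with rotations, shifts, shears and zooms:

```python
# toy grayscale "image" as nested lists (rows of pixel values)
image = [[1, 2, 3],
         [4, 5, 6]]

# a horizontal flip mirrors each row, as ImageDataGenerator's
# horizontal_flip=True does (randomly) for real training images
flipped = [row[::-1] for row in image]
print(flipped)  # → [[3, 2, 1], [6, 5, 4]]
```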
# set the model parameters
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
model_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch)
# rename the variables to be able to compare the values for different models
# and avoid overwriting the function values
augm_train_acc = results_train_acc
augm_val_acc = results_val_acc
augm_train_loss = results_train_loss
augm_val_loss = results_val_loss
# call plot function and set its arguments
plot2(epoch_no, augm_train_acc, augm_val_acc, augm_train_loss, augm_val_loss)
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Augmented Model',
y2 = callback_val_loss, style_2 = 'r', label_2 = 'Overfitted Model',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Overfitted and Augmented Model Validation Loss')
# get the maximum accuracy and minimum loss
augm_loss = round(min(augm_val_loss), 3)
augm_accuracy = round(np.max(augm_val_acc), 3)
# summarize the results in a dataframe
df_summary = pd.DataFrame([[augm_loss, augm_accuracy],
[callback_loss, callback_accuracy]],
['Augmented Model', 'Overfitted Model'],
columns = ['Val Loss', 'Val Accuracy']).round(3)
display(df_summary)
| | Val Loss | Val Accuracy |
|---|---|---|
| Augmented Model | 0.229 | 0.912 |
| Overfitted Model | 0.234 | 0.947 |
# define a model-building function with layer-size parameters so it can be reused
def dropout_model(filter1, filter2, filter3, filter4, neuron1, active,
optimizer, patch, drop_value):
model = Sequential()
model.add(layers.Conv2D(filter1, patch, activation = active, input_shape = (150, 150, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(filter2, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(filter3, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(filter4, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(neuron1, activation = active))
model.add(layers.Dropout(drop_value))
model.add(layers.Dense(1, activation = 'sigmoid'))
model.compile(loss = 'binary_crossentropy', optimizer = optimizer, metrics = 'accuracy')
return model
# model.summary()
# define the evaluation function using dropout
def dropout_evaluation(batch_size, epoch_no, train_generator_type, filter1,
filter2, filter3, filter4, neuron1, active,
optimizer, patch, drop_value):
# these are declared global so they can be accessed outside the function
global results_val_loss
global results_val_acc
global results_train_loss
global results_train_acc
model = dropout_model(filter1, filter2, filter3, filter4, neuron1,
active, optimizer, patch, drop_value)
history = model.fit(train_generator_type,
steps_per_epoch = len(train_generator_type.filenames) // batch_size,
epochs = epoch_no,
validation_data = validation_generator,
validation_steps = len(validation_generator.filenames) // batch_size,
verbose = 0)
results_val_loss = history.history['val_loss']
results_val_acc = history.history['val_accuracy']
results_train_loss = history.history['loss']
results_train_acc = history.history['accuracy']
# set the model parameters
drop_value = 0.1
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
dropout_evaluation(batch_size, epoch_no, train_generator_type, filter1,
filter2, filter3, filter4, neuron1, active,
optimizer, patch, drop_value)
# rename the variables to be able to compare the values for different models
# and avoid overwriting the function values
drop01_train_acc = results_train_acc
drop01_val_acc = results_val_acc
drop01_train_loss = results_train_loss
drop01_val_loss = results_val_loss
# call plot function and set its arguments
plot2(epoch_no, drop01_train_acc, drop01_val_acc, drop01_train_loss, drop01_val_loss)
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Augmented Model',
y2 = drop01_val_loss, style_2 = 'r', label_2 = 'dropout 0.1 Model',
xlabel = 'Epochs', ylabel = 'Loss', title = 'dropout 0.1 and Augmented Model Validation Loss')
# set the model parameters
drop_value = 0.3
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
dropout_evaluation(batch_size, epoch_no, train_generator_type, filter1,
filter2, filter3, filter4, neuron1, active,
optimizer, patch, drop_value)
# rename the variables to be able to compare the values for different models
# and avoid overwriting the function values
drop03_train_acc = results_train_acc
drop03_val_acc = results_val_acc
drop03_train_loss = results_train_loss
drop03_val_loss = results_val_loss
# call plot function and set its arguments
plot2(epoch_no, drop03_train_acc, drop03_val_acc, drop03_train_loss, drop03_val_loss)
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Augmented Model',
y2 = drop03_val_loss, style_2 = 'r', label_2 = 'dropout 0.3 Model',
xlabel = 'Epochs', ylabel = 'Loss', title = 'dropout 0.3 and Augmented Model Validation Loss')
# set the model parameters
drop_value = 0.5
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
dropout_evaluation(batch_size, epoch_no, train_generator_type, filter1,
filter2, filter3, filter4, neuron1, active,
optimizer, patch, drop_value)
# rename the variables to be able to compare the values for different models
# and avoid overwriting the function values
drop05_train_acc = results_train_acc
drop05_val_acc = results_val_acc
drop05_train_loss = results_train_loss
drop05_val_loss = results_val_loss
# call plot function and set its arguments
plot2(epoch_no, drop05_train_acc, drop05_val_acc, drop05_train_loss, drop05_val_loss)
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Augmented Model',
y2 = drop05_val_loss, style_2 = 'r', label_2 = 'dropout 0.5 Model',
xlabel = 'Epochs', ylabel = 'Loss', title = 'dropout 0.5 and Augmented Model Validation Loss')
# define a model-building function with layer-size parameters so it can be reused
def multi_dropout_model(filter1, filter2, filter3, filter4, neuron1, active,
optimizer, patch, drop_value1, drop_value2, drop_value3,
drop_value4, drop_value5):
model = Sequential()
model.add(layers.Conv2D(filter1, patch, activation = active, input_shape = (150, 150, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(drop_value1))
model.add(layers.Conv2D(filter2, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(drop_value2))
model.add(layers.Conv2D(filter3, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(drop_value3))
model.add(layers.Conv2D(filter4, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(drop_value4))
model.add(layers.Flatten())
model.add(layers.Dense(neuron1, activation = active))
model.add(layers.Dropout(drop_value5))
model.add(layers.Dense(1, activation = 'sigmoid'))
model.compile(loss = 'binary_crossentropy', optimizer = optimizer, metrics = 'accuracy')
return model
# model.summary()
def multi_dropout_evaluation(batch_size, epoch_no, train_generator_type, filter1,
filter2, filter3, filter4, neuron1, active,
optimizer, patch, drop_value1, drop_value2, drop_value3,
drop_value4, drop_value5):
# these are declared global so they can be accessed outside the function
global results_val_loss
global results_val_acc
global results_train_loss
global results_train_acc
model = multi_dropout_model(filter1, filter2, filter3, filter4, neuron1,
active, optimizer, patch, drop_value1, drop_value2,
drop_value3, drop_value4, drop_value5)
history = model.fit(train_generator_type,
steps_per_epoch = len(train_generator_type.filenames) // batch_size,
epochs = epoch_no,
validation_data = validation_generator,
validation_steps = len(validation_generator.filenames) // batch_size,
verbose = 0)
results_val_loss = history.history['val_loss']
results_val_acc = history.history['val_accuracy']
results_train_loss = history.history['loss']
results_train_acc = history.history['accuracy']
# set the model parameters
drop_value1 = 0.1
drop_value2 = 0.2
drop_value3 = 0.2
drop_value4 = 0.3
drop_value5 = 0.5
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
multi_dropout_evaluation(batch_size, epoch_no, train_generator_type, filter1,
filter2, filter3, filter4, neuron1, active,
optimizer, patch, drop_value1, drop_value2,
drop_value3, drop_value4, drop_value5)
# rename the variables to be able to compare the values for different models
# and avoid overwriting the function values
multiDrop_train_acc = results_train_acc
multiDrop_val_acc = results_val_acc
multiDrop_train_loss = results_train_loss
multiDrop_val_loss = results_val_loss
# call plot function and set its arguments
plot2(epoch_no, multiDrop_train_acc, multiDrop_val_acc, multiDrop_train_loss, multiDrop_val_loss)
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Augmented Model',
y2 = multiDrop_val_loss, style_2 = 'r', label_2 = 'Multi-dropout Model',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Multi-dropout and Augmented Model Validation Loss')
# compare 5 models: four dropout variants and the model without dropout
epoch_no = callback_epoch_no
epochs = range(1, epoch_no + 1)
plt.plot(epochs, drop01_val_loss, label = 'Dropout 0.1')
plt.plot(epochs, drop03_val_loss, label = 'Dropout 0.3')
plt.plot(epochs, drop05_val_loss, label = 'Dropout 0.5')
plt.plot(epochs, multiDrop_val_loss, label = 'Multi-dropout')
plt.plot(epochs, augm_val_loss, label = 'Without Dropout')
plt.title('Validation Loss for Four Dropout Models and the Model Without Dropout')
plt.xlabel('Epochs')
plt.ylabel('Val Loss')
plt.ylim(bottom = 0, top = 0.8)
plt.legend()
plt.show()
def smooth_curve(points, factor=0.8):
smoothed_points = []
for point in points:
if smoothed_points:
previous = smoothed_points[-1]
smoothed_points.append(previous * factor + point * (1 - factor))
else:
smoothed_points.append(point)
return smoothed_points
plt.plot(epochs, smooth_curve(drop01_val_loss), label='Smoothed Dropout 0.1')
plt.plot(epochs, smooth_curve(drop03_val_loss), label='Smoothed Dropout 0.3')
# plt.plot(epochs, smooth_curve(drop05_val_loss), label='Smoothed Dropout 0.5')
# plt.plot(epochs, smooth_curve(multiDrop_val_loss), label='Smoothed Multi-dropout')
plt.plot(epochs, smooth_curve(augm_val_loss), label='Smoothed Without Dropout')
plt.title('Smoothed Validation Loss')
plt.legend()
drop01_acc = max(drop01_val_acc)
drop03_acc = max(drop03_val_acc)
drop05_acc = max(drop05_val_acc)
multiDrop_acc = max(multiDrop_val_acc)
augm_acc = max(augm_val_acc)
drop01_loss = min(drop01_val_loss)
drop03_loss = min(drop03_val_loss)
drop05_loss = min(drop05_val_loss)
multiDrop_loss = min(multiDrop_val_loss)
augm_loss = min(augm_val_loss)
# summarizing the results in a data frame
df_summary = pd.DataFrame([[drop01_acc, drop01_loss], [drop03_acc, drop03_loss],
[drop05_acc, drop05_loss], [multiDrop_acc, multiDrop_loss],
[augm_acc, augm_loss]],
['Dropout 0.1', 'Dropout 0.3', 'Dropout 0.5', 'Multi-dropout', 'No dropout'],
columns=['Validation Accuracy', 'Validation Loss']).round(3)
display(df_summary)
value = min(df_summary['Validation Loss'])
item = df_summary[['Validation Loss']].idxmin().tolist()
print('Best value: {}, Using {}'.format(value, item))
| | Validation Accuracy | Validation Loss |
|---|---|---|
| Dropout 0.1 | 0.914 | 0.232 |
| Dropout 0.3 | 0.916 | 0.233 |
| Dropout 0.5 | 0.912 | 0.244 |
| Multi-dropout | 0.911 | 0.239 |
| No dropout | 0.912 | 0.229 |
Best value: 0.229, Using ['No dropout']
Although the 0.1 dropout model came closest, the model without dropout achieved the lowest validation loss (0.229) and outperformed all the dropout approaches in the line-graph comparison.
Therefore, the augmented model (without dropout) will be utilized in the next step.
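The selection rule used throughout (pick the configuration with the lowest validation loss, as `idxmin` does on the summary dataframe) can be sketched without pandas; the values below are the dropout figures from the summary table above:

```python
# minimum-validation-loss selection, mirroring df_summary[['Validation Loss']].idxmin()
results = {
    'Dropout 0.1': 0.232,
    'Dropout 0.3': 0.233,
    'Dropout 0.5': 0.244,
    'Multi-dropout': 0.239,
    'No dropout': 0.229,
}
best = min(results, key=results.get)  # key with the smallest loss
print('Best value: {}, Using {}'.format(results[best], best))
# → Best value: 0.229, Using No dropout
```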
# define a model-building function with layer-size parameters so it can be reused
def reg_model(filter1, filter2, filter3, filter4, neuron1, active, optimizer,
patch, regularizer):
model = Sequential()
model.add(layers.Conv2D(filter1, patch, activation = active, kernel_regularizer = regularizer, input_shape = (150, 150, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(filter2, patch, activation = active, kernel_regularizer = regularizer))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(filter3, patch, activation = active, kernel_regularizer = regularizer))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(filter4, patch, activation = active, kernel_regularizer = regularizer))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(neuron1, activation = active, kernel_regularizer = regularizer))
model.add(layers.Dense(1, activation = 'sigmoid'))
model.compile(loss = 'binary_crossentropy', optimizer = optimizer, metrics = 'accuracy')
return model
# model.summary()
def reg_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2,
filter3, filter4, neuron1, active, optimizer, patch, regularizer):
# these are declared global so they can be accessed outside the function
global results_val_loss
global results_val_acc
global results_train_loss
global results_train_acc
model = reg_model(filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch, regularizer)
history = model.fit(
train_generator_type,
steps_per_epoch = len(train_generator_type.filenames) // batch_size,
epochs = epoch_no,
validation_data = validation_generator,
validation_steps = len(validation_generator.filenames) // batch_size,
verbose = 0)
results_val_loss = history.history['val_loss']
results_val_acc = history.history['val_accuracy']
results_train_loss = history.history['loss']
results_train_acc = history.history['accuracy']
# set the model parameters
regularizer = regularizers.l1(0.001)
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
reg_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch, regularizer)
# rename the variables to be able to compare the values for different models
# and avoid overwriting the function values
regL1_train_acc = results_train_acc
regL1_val_acc = results_val_acc
regL1_train_loss = results_train_loss
regL1_val_loss = results_val_loss
# call plot function and set its arguments
plot2(epoch_no, regL1_train_acc, regL1_val_acc, regL1_train_loss, regL1_val_loss)
# set the model parameters
regularizer = regularizers.l2(0.001)
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
reg_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch, regularizer)
# rename the variables to be able to compare the values for different models
# and avoid overwriting the function values
regL2_train_acc = results_train_acc
regL2_val_acc = results_val_acc
regL2_train_loss = results_train_loss
regL2_val_loss = results_val_loss
# call plot function and set its arguments
plot2(epoch_no, regL2_train_acc, regL2_val_acc, regL2_train_loss, regL2_val_loss)
# set the model parameters
regularizer = regularizers.l1_l2(l1 = 0.001, l2 = 0.001)
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
reg_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch, regularizer)
# rename the variables to be able to compare the values for different models
# and avoid overwriting the function values
regL1L2_train_acc = results_train_acc
regL1L2_val_acc = results_val_acc
regL1L2_train_loss = results_train_loss
regL1L2_val_loss = results_val_loss
# call plot function and set its arguments
plot2(epoch_no, regL1L2_train_acc, regL1L2_val_acc, regL1L2_train_loss, regL1L2_val_loss)
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Augmented Model',
y2 = regL2_val_loss, style_2 = 'r', label_2 = 'L2 Regularizer Model',
xlabel = 'Epochs', ylabel = 'Loss', title = 'L2 Regularizer and Augmented Model Validation Loss')
# get the maximum accuracy and minimum loss
regL2_loss = round(min(regL2_val_loss), 3)
regL2_accuracy = round(np.max(regL2_val_acc), 3)
# summarize the results in a dataframe
df_summary = pd.DataFrame([[regL2_loss, regL2_accuracy],
[augm_loss, augm_accuracy]],
['L2 Regularizer Model', 'Augmented Model'],
columns = ['Val Loss', 'Val Accuracy']).round(3)
display(df_summary)
| | Val Loss | Val Accuracy |
|---|---|---|
| L2 Regularizer Model | 0.308 | 0.902 |
| Augmented Model | 0.229 | 0.912 |
# define a model-building function with layer-size parameters so it can be reused
def reg_model1(filter1, filter2, filter3, filter4, neuron1, active, optimizer,
patch, regularizer):
model = Sequential()
model.add(layers.Conv2D(filter1, patch, activation = active, input_shape = (150, 150, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(filter2, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(filter3, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(filter4, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(neuron1, activation = active, kernel_regularizer = regularizer))
model.add(layers.Dense(1, activation = 'sigmoid'))
model.compile(loss = 'binary_crossentropy', optimizer = optimizer, metrics = 'accuracy')
return model
# model.summary()
def reg_evaluation1(batch_size, epoch_no, train_generator_type, filter1, filter2,
filter3, filter4, neuron1, active, optimizer, patch, regularizer):
# these are declared global so they can be accessed outside the function
global results_val_loss
global results_val_acc
global results_train_loss
global results_train_acc
model = reg_model1(filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch, regularizer)
history = model.fit(
train_generator_type,
steps_per_epoch = len(train_generator_type.filenames) // batch_size,
epochs = epoch_no,
validation_data = validation_generator,
validation_steps = len(validation_generator.filenames) // batch_size,
verbose = 0)
results_val_loss = history.history['val_loss']
results_val_acc = history.history['val_accuracy']
results_train_loss = history.history['loss']
results_train_acc = history.history['accuracy']
# set the model parameters
regularizer = regularizers.l1(0.001)
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
reg_evaluation1(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch, regularizer)
# rename the variables to be able to compare the values for different models
# and avoid overwriting the function values
regL11_train_acc = results_train_acc
regL11_val_acc = results_val_acc
regL11_train_loss = results_train_loss
regL11_val_loss = results_val_loss
# call plot function and set its arguments
plot2(epoch_no, regL11_train_acc, regL11_val_acc, regL11_train_loss, regL11_val_loss)
# set the model parameters
regularizer = regularizers.l2(0.001)
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
reg_evaluation1(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch, regularizer)
# rename the variables to be able to compare the values for different models
# and avoid overwriting the function values
regL22_train_acc = results_train_acc
regL22_val_acc = results_val_acc
regL22_train_loss = results_train_loss
regL22_val_loss = results_val_loss
# call plot function and set its arguments
plot2(epoch_no, regL22_train_acc, regL22_val_acc, regL22_train_loss, regL22_val_loss)
# set the model parameters
regularizer = regularizers.l1_l2(l1 = 0.001, l2 = 0.001)
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
reg_evaluation1(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch, regularizer)
# rename the variables to be able to compare the values for different models
# and avoid overwriting the function values
regL11L22_train_acc = results_train_acc
regL11L22_val_acc = results_val_acc
regL11L22_train_loss = results_train_loss
regL11L22_val_loss = results_val_loss
# call plot function and set its arguments
plot2(epoch_no, regL11L22_train_acc, regL11L22_val_acc, regL11L22_train_loss, regL11L22_val_loss)
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Augmented Model',
y2 = regL22_val_loss, style_2 = 'r', label_2 = 'L2 Regularizer (Single Layer) Model',
xlabel = 'Epochs', ylabel = 'Loss', title = 'L2 Regularizer (Single Layer) and Augmented Model Validation Loss')
# get the maximum accuracy and minimum loss
regL22_loss = round(min(regL22_val_loss), 3)
regL22_accuracy = round(np.max(regL22_val_acc), 3)
# summarize the results in a dataframe
df_summary = pd.DataFrame([[regL22_loss, regL22_accuracy],
[augm_loss, augm_accuracy]],
['L2 Regularizer (Single Layer) Model', 'Augmented Model'],
columns = ['Val Loss', 'Val Accuracy']).round(3)
display(df_summary)
# compare the augmented model with single-layer and multi-layer L2 regularization
epoch_no = callback_epoch_no
epochs = range(1, epoch_no + 1)
plt.plot(epochs, regL2_val_loss, label = 'L2 Multi-layer')
plt.plot(epochs, regL22_val_loss, label = 'L2 Single Layer')
plt.plot(epochs, augm_val_loss, label = 'Without Regularizers')
plt.title('Validation Loss for Augmented Model with Single and Multi-regularizers')
plt.xlabel('Epochs')
plt.ylabel('Val Loss')
plt.ylim(bottom = 0, top = 0.8)
plt.legend()
plt.show()
# summarize the results in a dataframe
df_summary = pd.DataFrame([[regL22_loss, regL22_accuracy],
[regL2_loss, regL2_accuracy],
[augm_loss, augm_accuracy]],
['L2 Regularizer Single Layer', 'L2 Regularizer Multi-layer', 'Augmented Model'],
columns = ['Val Loss', 'Val Accuracy']).round(3)
display(df_summary)
| | Val Loss | Val Accuracy |
|---|---|---|
| L2 Regularizer Single Layer | 0.323 | 0.899 |
| L2 Regularizer Multi-layer | 0.308 | 0.902 |
| Augmented Model | 0.229 | 0.912 |
# set the model parameters
batch_size = 64
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
model_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch)
# rename the variables to be able to compare the values for different models
# and avoid overwriting the function values
batch64_train_acc = results_train_acc
batch64_val_acc = results_val_acc
batch64_train_loss = results_train_loss
batch64_val_loss = results_val_loss
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Batch Size 20',
y2 = batch64_val_loss, style_2 = 'r', label_2 = 'Batch Size 64',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Validation Loss Model With 20 and 64 Batch Size')
# set the model parameters
batch_size = 128
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
model_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch)
# rename the variables to be able to compare the values for different models
# and avoid overwriting the function values
batch128_train_acc = results_train_acc
batch128_val_acc = results_val_acc
batch128_train_loss = results_train_loss
batch128_val_loss = results_val_loss
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Batch Size 20',
y2 = batch128_val_loss, style_2 = 'r', label_2 = 'Batch Size 128',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Validation Loss Model With 20 and 128 Batch Size')
# set the model parameters
batch_size = 256
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
model_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch)
# rename the variables to be able to compare the values for different models
# and avoid overwriting the function values
batch256_train_acc = results_train_acc
batch256_val_acc = results_val_acc
batch256_train_loss = results_train_loss
batch256_val_loss = results_val_loss
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Batch Size 20',
y2 = batch256_val_loss, style_2 = 'r', label_2 = 'Batch Size 256',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Validation Loss Model With 20 and 256 Batch Size')
batch64_acc = max(batch64_val_acc)
batch128_acc = max(batch128_val_acc)
batch256_acc = max(batch256_val_acc)
augm_acc = max(augm_val_acc)
batch64_loss = min(batch64_val_loss)
batch128_loss = min(batch128_val_loss)
batch256_loss = min(batch256_val_loss)
augm_loss = min(augm_val_loss)
# summarizing the results in a data frame
df_summary = pd.DataFrame([[batch64_acc, batch64_loss], [batch128_acc, batch128_loss], [batch256_acc, batch256_loss], [augm_acc, augm_loss]],
['Batch size 64', 'Batch size 128', 'Batch size 256', 'Original batch size 20'],
columns=['Validation Accuracy', 'Validation Loss']).round(3)
display(df_summary)
value = min(df_summary['Validation Loss'])
item = df_summary[['Validation Loss']].idxmin().tolist()
print('Best value: {}, Using {}'.format(value, item))
| | Validation Accuracy | Validation Loss |
|---|---|---|
| Batch size 64 | 0.920 | 0.217 |
| Batch size 128 | 0.914 | 0.243 |
| Batch size 256 | 0.950 | 0.222 |
| Original batch size 20 | 0.912 | 0.229 |
Best value: 0.217, Using ['Batch size 64']
# plot 4 different models with batch size values 64, 128, 256 and the original 20
nEpoch = callback_epoch_no
epochs = range(1, nEpoch + 1)
plt.plot(epochs, batch64_val_loss, 'c', label = 'Batch Size 64') # c is for solid cyan line
plt.plot(epochs, batch128_val_loss, 'b', label ='Batch Size 128') # b is for solid blue line
plt.plot(epochs, batch256_val_loss, 'b--', label = 'Batch Size 256') # b-- is for Dashed blue line
plt.plot(epochs, augm_val_loss, 'r', label = 'With original batch size 20') # r is for solid red line
plt.title('Validation Loss Comparisons Across Four Models')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
def smooth_curve(points, factor = 0.8):
smoothed_points = []
for point in points:
if smoothed_points:
previous = smoothed_points[-1]
smoothed_points.append(previous * factor + point * (1 - factor))
else:
smoothed_points.append(point)
return smoothed_points
plt.plot(epochs, smooth_curve(batch64_val_loss), label = 'Batch Size 64')
plt.plot(epochs, smooth_curve(batch128_val_loss), label = 'Batch Size 128')
plt.plot(epochs, smooth_curve(batch256_val_loss), label = 'Batch Size 256')
plt.plot(epochs, smooth_curve(augm_val_loss), label = 'With original batch size 20')
plt.title('Smoothed Validation Loss Comparisons Across Four Models')
plt.legend()
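The `smooth_curve` helper used above is an exponential moving average; as a standalone sanity check of its behaviour (with made-up loss values, not the notebook's training curves):

```python
def smooth_curve(points, factor=0.8):
    # exponential moving average: each new point contributes only 20%,
    # so epoch-to-epoch noise in the loss curves is damped
    smoothed_points = []
    for point in points:
        if smoothed_points:
            previous = smoothed_points[-1]
            smoothed_points.append(previous * factor + point * (1 - factor))
        else:
            smoothed_points.append(point)
    return smoothed_points

# the first point passes through unchanged; a sudden drop to zero decays
# geometrically (0.8, 0.64, 0.512, ...) instead of appearing as a spike
smoothed = smooth_curve([1.0, 0.0, 0.0, 0.0])
assert smoothed[0] == 1.0
assert abs(smoothed[1] - 0.8) < 1e-9
assert abs(smoothed[3] - 0.512) < 1e-9
```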
# set the model parameters
batch_size = 20
filter1 = 8
filter2 = 32
filter3 = 64
filter4 = 64
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
model_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch)
# rename the variables so results from different models can be compared
# without overwriting the function's output values
smallFilter_train_acc = results_train_acc
smallFilter_val_acc = results_val_acc
smallFilter_train_loss = results_train_loss
smallFilter_val_loss = results_val_loss
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Original Filter Size',
y2 = smallFilter_val_loss, style_2 = 'r', label_2 = 'Small Filter Size',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Validation Loss Model With Small and Original Filter Size')
# set the model parameters
batch_size = 20
filter1 = 64
filter2 = 64
filter3 = 128
filter4 = 256
neuron1 = 512
active = 'relu'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
model_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch)
# rename the variables so results from different models can be compared
# without overwriting the function's output values
largeFilter_train_acc = results_train_acc
largeFilter_val_acc = results_val_acc
largeFilter_train_loss = results_train_loss
largeFilter_val_loss = results_val_loss
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Original Filter Size',
y2 = largeFilter_val_loss, style_2 = 'r', label_2 = 'Large Filter Size',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Validation Loss Model With Large and Original Filter Size')
smallFilter_acc = max(smallFilter_val_acc)
largeFilter_acc = max(largeFilter_val_acc)
augm_acc = max(augm_val_acc)
smallFilter_loss = min(smallFilter_val_loss)
largeFilter_loss = min(largeFilter_val_loss)
augm_loss = min(augm_val_loss)
# summarizing the results in a data frame
df_summary = pd.DataFrame([[smallFilter_acc, smallFilter_loss], [largeFilter_acc, largeFilter_loss], [augm_acc, augm_loss]],
['Small Filter Size', 'Large Filter Size', 'Original Filter Size'],
columns=['Validation Accuracy', 'Validation Loss']).round(3)
display(df_summary)
value = min(df_summary['Validation Loss'])
item = df_summary[['Validation Loss']].idxmin().tolist()
print('Best value: {}, Using {}'.format(value, item))
| | Validation Accuracy | Validation Loss |
|---|---|---|
| Small Filter Size | 0.908 | 0.230 |
| Large Filter Size | 0.918 | 0.230 |
| Original Filter Size | 0.912 | 0.229 |
Best value: 0.229, Using ['Original Filter Size']
# plot 3 different models with small and large filter size
nEpoch = callback_epoch_no
epochs = range(1, nEpoch + 1)
plt.plot(epochs, smallFilter_val_loss, 'c', label = 'Small Filter Size') # c is for solid cyan line
plt.plot(epochs, largeFilter_val_loss, 'b', label = 'Large Filter Size') # b is for solid blue line
plt.plot(epochs, augm_val_loss, 'r', label = 'Original Filter Size') # r is for solid red line
plt.title('Validation Loss Comparisons Across Three Models')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
def smooth_curve(points, factor = 0.8):
smoothed_points = []
for point in points:
if smoothed_points:
previous = smoothed_points[-1]
smoothed_points.append(previous * factor + point * (1 - factor))
else:
smoothed_points.append(point)
return smoothed_points
plt.plot(epochs, smooth_curve(smallFilter_val_loss), label = 'Small Filter Size')
plt.plot(epochs, smooth_curve(largeFilter_val_loss), label = 'Large Filter Size')
plt.plot(epochs, smooth_curve(augm_val_loss), label = 'Original Filter Size')
plt.title('Smoothed Validation Loss Comparisons Across Three Models')
plt.legend()
# set the model parameters
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'tanh'
optimizer = 'rmsprop'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
model_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch)
# rename the variables so results from different models can be compared
# without overwriting the function's output values
tanh_train_acc = results_train_acc
tanh_val_acc = results_val_acc
tanh_train_loss = results_train_loss
tanh_val_loss = results_val_loss
tanh_acc = max(tanh_val_acc)
augm_acc = max(augm_val_acc)
tanh_loss = min(tanh_val_loss)
augm_loss = min(augm_val_loss)
# summarizing the results in a data frame
df_summary = pd.DataFrame([[tanh_acc, tanh_loss], [augm_acc, augm_loss]],
['tanh activation', 'relu activation'],
columns=['Validation Accuracy', 'Validation Loss']).round(3)
display(df_summary)
value = min(df_summary['Validation Loss'])
item = df_summary[['Validation Loss']].idxmin().tolist()
print('Best value: {}, Using {}'.format(value, item))
| | Validation Accuracy | Validation Loss |
|---|---|---|
| tanh activation | 0.885 | 0.335 |
| relu activation | 0.912 | 0.229 |
Best value: 0.229, Using ['relu activation']
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'relu activation',
y2 = tanh_val_loss, style_2 = 'r', label_2 = 'tanh activation',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Validation Loss Model With relu and tanh Activation Function')
# set the model parameters
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'adam'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
model_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch)
# rename the variables so results from different models can be compared
# without overwriting the function's output values
adam_train_acc = results_train_acc
adam_val_acc = results_val_acc
adam_train_loss = results_train_loss
adam_val_loss = results_val_loss
adam_acc = max(adam_val_acc)
augm_acc = max(augm_val_acc)
adam_loss = min(adam_val_loss)
augm_loss = min(augm_val_loss)
# summarizing the results in a data frame
df_summary = pd.DataFrame([[adam_acc, adam_loss], [augm_acc, augm_loss]],
['adam optimizer', 'rmsprop optimizer'],
columns=['Validation Accuracy', 'Validation Loss']).round(3)
display(df_summary)
value = min(df_summary['Validation Loss'])
item = df_summary[['Validation Loss']].idxmin().tolist()
print('Best value: {}, Using {}'.format(value, item))
| | Validation Accuracy | Validation Loss |
|---|---|---|
| adam optimizer | 0.909 | 0.225 |
| rmsprop optimizer | 0.912 | 0.229 |
Best value: 0.225, Using ['adam optimizer']
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'rmsprop optimizer',
y2 = adam_val_loss, style_2 = 'r', label_2 = 'adam optimizer',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Validation Loss Model With adam and rmsprop Optimizer')
def smooth_curve(points, factor = 0.8):
smoothed_points = []
for point in points:
if smoothed_points:
previous = smoothed_points[-1]
smoothed_points.append(previous * factor + point * (1 - factor))
else:
smoothed_points.append(point)
return smoothed_points
plt.plot(epochs, smooth_curve(adam_val_loss), label = 'adam optimizer')
plt.plot(epochs, smooth_curve(augm_val_loss), label = 'rmsprop optimizer')
plt.title('Smoothed Validation Loss With adam and rmsprop Optimizer')
plt.legend()
Adam will be used in the following models, as it outperformed RMSprop on validation loss.
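For intuition, the Adam update rule can be sketched in plain Python (a simplified illustration with the standard default hyper-parameters; the notebook itself relies on Keras's built-in optimizer):

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # Adam keeps exponential moving averages of the gradient (m) and of its
    # square (v), with bias correction for the early steps t = 1, 2, ...
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# minimize f(x) = x^2 (gradient 2x) starting from x = 1.0
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 201):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.05)
assert abs(x) < 0.5  # x has moved toward the minimum at 0
```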
# set the model parameters
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'adam'
patch = (5, 5)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
model_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch)
# rename the variables so results from different models can be compared
# without overwriting the function's output values
patch5_train_acc = results_train_acc
patch5_val_acc = results_val_acc
patch5_train_loss = results_train_loss
patch5_val_loss = results_val_loss
patch5_acc = max(patch5_val_acc)
augm_acc = max(augm_val_acc)
patch5_loss = min(patch5_val_loss)
augm_loss = min(augm_val_loss)
# summarizing the results in a data frame
df_summary = pd.DataFrame([[patch5_acc, patch5_loss], [augm_acc, augm_loss]],
['Window Size 5x5', 'Window Size 3x3'],
columns=['Validation Accuracy', 'Validation Loss']).round(3)
display(df_summary)
value = min(df_summary['Validation Loss'])
item = df_summary[['Validation Loss']].idxmin().tolist()
print('Best value: {}, Using {}'.format(value, item))
| | Validation Accuracy | Validation Loss |
|---|---|---|
| Window Size 5x5 | 0.905 | 0.218 |
| Window Size 3x3 | 0.912 | 0.229 |
Best value: 0.218, Using ['Window Size 5x5']
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Window Size 3x3',
y2 = patch5_val_loss, style_2 = 'r', label_2 = 'Window Size 5x5',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Validation Loss Model With 3x3 and 5x5 Filter Window Size')
def smooth_curve(points, factor = 0.8):
smoothed_points = []
for point in points:
if smoothed_points:
previous = smoothed_points[-1]
smoothed_points.append(previous * factor + point * (1 - factor))
else:
smoothed_points.append(point)
return smoothed_points
plt.plot(epochs, smooth_curve(patch5_val_loss), label = 'Window Size 5x5')
plt.plot(epochs, smooth_curve(augm_val_loss), label = 'Window Size 3x3')
plt.title('Smoothed Validation Loss With 3x3 and 5x5 Filter Window Size')
plt.legend()
# define a model function with units parameters to be easily called every time
def addLayers_model(filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch, neuron2):
model = Sequential()
model.add(layers.Conv2D(filter1, patch, activation = active, input_shape = (150, 150, 1)))
model.add(layers.Conv2D(filter1, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(filter2, patch, activation = active))
model.add(layers.Conv2D(filter2, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(filter3, patch, activation = active))
model.add(layers.Conv2D(filter3, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(filter4, patch, activation = active))
model.add(layers.Conv2D(filter4, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(neuron1, activation = active))
model.add(layers.Dense(neuron2, activation = active))
model.add(layers.Dense(1, activation = 'sigmoid'))
model.compile(loss = 'binary_crossentropy', optimizer = optimizer, metrics = 'accuracy')
return model
# model.summary()
def addLayers_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch, neuron2):
# these are declared global so they can be accessed outside the function
global results_val_loss
global results_val_acc
global results_train_loss
global results_train_acc
model = addLayers_model(filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch, neuron2)
history = model.fit(
train_generator_type,
steps_per_epoch = len(train_generator_type.filenames) // batch_size,
epochs = epoch_no,
validation_data = validation_generator,
validation_steps = len(validation_generator.filenames) // batch_size,
verbose = 0)
results_val_loss = history.history['val_loss']
results_val_acc = history.history['val_accuracy']
results_train_loss = history.history['loss']
results_train_acc = history.history['accuracy']
# set the model parameters
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
neuron2 = 64
active = 'relu'
optimizer = 'adam'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
addLayers_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch, neuron2)
# rename the variables so results from different models can be compared
# without overwriting the function's output values
addLayers_train_acc = results_train_acc
addLayers_val_acc = results_val_acc
addLayers_train_loss = results_train_loss
addLayers_val_loss = results_val_loss
addLayers_acc = max(addLayers_val_acc)
augm_acc = max(augm_val_acc)
addLayers_loss = min(addLayers_val_loss)
augm_loss = min(augm_val_loss)
# summarizing the results in a data frame
df_summary = pd.DataFrame([[addLayers_acc, addLayers_loss], [augm_acc, augm_loss]],
['Added Layers', 'Original Layers'],
columns=['Validation Accuracy', 'Validation Loss']).round(3)
display(df_summary)
value = min(df_summary['Validation Loss'])
item = df_summary[['Validation Loss']].idxmin().tolist()
print('Best value: {}, Using {}'.format(value, item))
| | Validation Accuracy | Validation Loss |
|---|---|---|
| Added Layers | 0.882 | 0.323 |
| Original Layers | 0.912 | 0.229 |
Best value: 0.229, Using ['Original Layers']
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Original Layers',
y2 = addLayers_val_loss, style_2 = 'r', label_2 = 'Added Layers',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Validation Loss Model With and Without Adding Layers')
The padding technique ('same' padding) will be used so that the spatial dimensions of the output feature map match those of the input. This is achieved by adding extra rows and columns of zeros around the border before convolving.
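The effect of 'same' padding on spatial dimensions follows from the standard convolution output-size formula; a quick arithmetic check, independent of Keras:

```python
def conv_output_size(n, k, p, s=1):
    # output size of a convolution along one axis:
    # floor((n + 2*padding - kernel) / stride) + 1
    return (n + 2 * p - k) // s + 1

# 'valid' convolution (no padding): a 3x3 kernel shrinks each 150-pixel axis
assert conv_output_size(150, 3, 0) == 148
# 'same' padding adds (k - 1) // 2 zeros per side, preserving the size
assert conv_output_size(150, 3, (3 - 1) // 2) == 150
```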
def padd_model(filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch):
model = Sequential()
model.add(layers.Conv2D(filter1, patch, activation = active, input_shape = (150, 150, 1), padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(filter2, patch, activation = active, padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(filter3, patch, activation = active, padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(filter4, patch, activation = active, padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(neuron1, activation = active))
model.add(layers.Dense(1, activation = 'sigmoid'))
model.compile(loss = 'binary_crossentropy', optimizer = optimizer, metrics = 'accuracy')
return model
def padd_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch):
# these are declared global so they can be accessed outside the function
global results_val_loss
global results_val_acc
global results_train_loss
global results_train_acc
model = padd_model(filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch)
history = model.fit(train_generator_type,
steps_per_epoch = len(train_generator_type.filenames) // batch_size,
epochs = epoch_no,
validation_data = validation_generator,
validation_steps = len(validation_generator.filenames) // batch_size,
verbose = 0)
results_val_loss = history.history['val_loss']
results_val_acc = history.history['val_accuracy']
results_train_loss = history.history['loss']
results_train_acc = history.history['accuracy']
# set the model parameters
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'adam'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
padd_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch)
# rename the variables so results from different models can be compared
# without overwriting the function's output values
padd_train_acc = results_train_acc
padd_val_acc = results_val_acc
padd_train_loss = results_train_loss
padd_val_loss = results_val_loss
padd_acc = max(padd_val_acc)
augm_acc = max(augm_val_acc)
padd_loss = min(padd_val_loss)
augm_loss = min(augm_val_loss)
# summarizing the results in a data frame
df_summary = pd.DataFrame([[padd_acc, padd_loss], [augm_acc, augm_loss]],
['With Padding', 'Without Padding'],
columns=['Validation Accuracy', 'Validation Loss']).round(3)
display(df_summary)
value = min(df_summary['Validation Loss'])
item = df_summary[['Validation Loss']].idxmin().tolist()
print('Best value: {}, Using {}'.format(value, item))
| | Validation Accuracy | Validation Loss |
|---|---|---|
| With Padding | 0.912 | 0.213 |
| Without Padding | 0.912 | 0.229 |
Best value: 0.213, Using ['With Padding']
plot1(epochs = callback_epoch_no, y1 = augm_val_loss, style_1 = 'b', label_1 = 'Without Padding',
y2 = padd_val_loss, style_2 = 'r', label_2 = 'With Padding',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Validation Loss Model With and Without Padding')
Padding will be used in the next models, as it shows a slight improvement when applied.
# define a model function with units parameters to be easily called every time
def depthSep_model(filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch):
model = models.Sequential()
model.add(layers.SeparableConv2D(filter1, patch, activation = active, input_shape = (150, 150, 1), padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.SeparableConv2D(filter2, patch, activation = active, padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.SeparableConv2D(filter3, patch, activation = active, padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.SeparableConv2D(filter4, patch, activation = active, padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(neuron1, activation = active))
model.add(layers.Dense(1, activation = 'sigmoid'))
model.compile(loss = 'binary_crossentropy', optimizer = optimizer, metrics = 'accuracy')
return model
# model.summary()
def depthSep_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch):
# these are declared global so they can be accessed outside the function
global results_val_loss
global results_val_acc
global results_train_loss
global results_train_acc
model = depthSep_model(filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch)
history = model.fit(
train_generator_type,
steps_per_epoch = len(train_generator_type.filenames) // batch_size,
epochs = epoch_no,
validation_data = validation_generator,
validation_steps = len(validation_generator.filenames) // batch_size,
verbose = 0)
results_val_loss = history.history['val_loss']
results_val_acc = history.history['val_accuracy']
results_train_loss = history.history['loss']
results_train_acc = history.history['accuracy']
# set the model parameters
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'adam'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
depthSep_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch)
# rename the variables so results from different models can be compared
# without overwriting the function's output values
depthSep_train_acc = results_train_acc
depthSep_val_acc = results_val_acc
depthSep_train_loss = results_train_loss
depthSep_val_loss = results_val_loss
depthSep_acc = max(depthSep_val_acc)
padd_acc = max(padd_val_acc)
depthSep_loss = min(depthSep_val_loss)
padd_loss = min(padd_val_loss)
# summarizing the results in a data frame
df_summary = pd.DataFrame([[depthSep_acc, depthSep_loss], [padd_acc, padd_loss]],
['With depthSep', 'Without depthSep'],
columns=['Validation Accuracy', 'Validation Loss']).round(3)
display(df_summary)
value = min(df_summary['Validation Loss'])
item = df_summary[['Validation Loss']].idxmin().tolist()
print('Best value: {}, Using {}'.format(value, item))
| | Validation Accuracy | Validation Loss |
|---|---|---|
| With depthSep | 0.500 | 0.693 |
| Without depthSep | 0.912 | 0.213 |
Best value: 0.213, Using ['Without depthSep']
plot1(epochs = callback_epoch_no, y1 = padd_val_loss, style_1 = 'b', label_1 = 'Without Depthwise',
y2 = depthSep_val_loss, style_2 = 'r', label_2 = 'With Depthwise',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Validation Loss Model With and Without Depthwise')
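Although the depthwise separable model underperformed here, its usual appeal is parameter efficiency. The saving can be sketched with the standard parameter-count formulas (following Keras's Conv2D / SeparableConv2D conventions; the channel numbers below mirror the notebook's third convolutional block):

```python
def conv2d_params(k, c_in, c_out):
    # standard convolution: one k x k x c_in kernel per output channel, plus biases
    return k * k * c_in * c_out + c_out

def separable_conv2d_params(k, c_in, c_out):
    # depthwise step: one k x k kernel per input channel,
    # then a 1x1 pointwise convolution mixing channels, plus biases
    return k * k * c_in + c_in * c_out + c_out

# 3x3 kernels, 64 -> 128 channels (the notebook's third block)
assert conv2d_params(3, 64, 128) == 73856
assert separable_conv2d_params(3, 64, 128) == 8896
# roughly an 8x reduction in parameters for this layer
```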
# define a model function with units parameters to be easily called every time
def batchNorm_model(filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch):
model = models.Sequential()
model.add(layers.Conv2D(filter1, patch, activation = active, input_shape = (150, 150, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.BatchNormalization())
model.add(layers.Conv2D(filter2, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.BatchNormalization())
model.add(layers.Conv2D(filter3, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.BatchNormalization())
model.add(layers.Conv2D(filter4, patch, activation = active))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.BatchNormalization())
model.add(layers.Flatten())
model.add(layers.Dense(neuron1, activation = active))
model.add(layers.BatchNormalization())
model.add(layers.Dense(1, activation = 'sigmoid'))
model.compile(loss = 'binary_crossentropy', optimizer = optimizer, metrics = 'accuracy')
return model
# model.summary()
def batchNorm_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch):
# these are declared global so they can be accessed outside the function
global results_val_loss
global results_val_acc
global results_train_loss
global results_train_acc
model = batchNorm_model(filter1, filter2, filter3, filter4, neuron1, active, optimizer, patch)
history = model.fit(
train_generator_type,
steps_per_epoch = len(train_generator_type.filenames) // batch_size,
epochs = epoch_no,
validation_data = validation_generator,
validation_steps = len(validation_generator.filenames) // batch_size,
verbose = 0)
results_val_loss = history.history['val_loss']
results_val_acc = history.history['val_accuracy']
results_train_loss = history.history['loss']
results_train_acc = history.history['accuracy']
# set the model parameters
batch_size = 20
filter1 = 32
filter2 = 64
filter3 = 128
filter4 = 128
neuron1 = 512
active = 'relu'
optimizer = 'adam'
patch = (3, 3)
epoch_no = callback_epoch_no
train_generator_type = train_generator_augm # train generator with data augmentation
batchNorm_evaluation(batch_size, epoch_no, train_generator_type, filter1, filter2, filter3,
filter4, neuron1, active, optimizer, patch)
# rename the variables so results from different models can be compared
# without overwriting the function's output values
batchNorm_train_acc = results_train_acc
batchNorm_val_acc = results_val_acc
batchNorm_train_loss = results_train_loss
batchNorm_val_loss = results_val_loss
batchNorm_acc = max(batchNorm_val_acc)
padd_acc = max(padd_val_acc)
batchNorm_loss = min(batchNorm_val_loss)
padd_loss = min(padd_val_loss)
# summarizing the results in a data frame
df_summary = pd.DataFrame([[batchNorm_acc, batchNorm_loss], [padd_acc, padd_loss]],
['With Batch Normalization', 'Without Batch Normalization'],
columns=['Validation Accuracy', 'Validation Loss']).round(3)
display(df_summary)
value = min(df_summary['Validation Loss'])
item = df_summary[['Validation Loss']].idxmin().tolist()
print('Best value: {}, Using {}'.format(value, item))
| | Validation Accuracy | Validation Loss |
|---|---|---|
| With Batch Normalization | 0.915 | 0.217 |
| Without Batch Normalization | 0.912 | 0.213 |
Best value: 0.213, Using ['Without Batch Normalization']
plot1(epochs = callback_epoch_no, y1 = padd_val_loss, style_1 = 'b', label_1 = 'Without Batch Normalization',
y2 = batchNorm_val_loss, style_2 = 'r', label_2 = 'With Batch Normalization',
xlabel = 'Epochs', ylabel = 'Loss', title = 'Validation Loss Model With and Without Batch Normalization')
The model below was reused from a GitHub repository; it is cited in the reference section.
def MiniVGGNet(width, height, depth, classes):
model = Sequential()
inputShape = (height, width, depth)
chanDim = -1
if K.image_data_format()=="channels_first":
inputShape = (depth, height, width)
chanDim = 1
model.add(layers.Conv2D(32, (3,3), padding = "same", input_shape = inputShape))
model.add(layers.Activation("relu"))
model.add(layers.BatchNormalization(axis=chanDim))
model.add(layers.Conv2D(32, (3,3), padding = "same"))
model.add(layers.Activation("relu"))
model.add(layers.BatchNormalization(axis=chanDim))
model.add(layers.MaxPooling2D(pool_size=(2,2)))
model.add(layers.Dropout(0.25))
model.add(layers.Conv2D(64, (3,3), padding = "same"))
model.add(layers.Activation("relu"))
model.add(layers.BatchNormalization(axis=chanDim))
model.add(layers.Conv2D(64, (3,3), padding = "same"))
model.add(layers.Activation("relu"))
model.add(layers.BatchNormalization(axis=chanDim))
model.add(layers.MaxPooling2D(pool_size=(2,2)))
model.add(layers.Dropout(0.25))
model.add(layers.Flatten())
model.add(layers.Dense(512))
model.add(layers.Activation("relu"))
model.add(layers.BatchNormalization())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(classes))
# model.add(layers.Activation("softmax"))
model.add(layers.Activation("sigmoid"))
return model
# from pyimagesearch.minivggnet import MiniVGGNet
# conv_base = MiniVGGNet.build(width = 150, height = 150, depth = 1, classes = 2)
conv_base = MiniVGGNet(width = 150, height = 150, depth = 1, classes = 2)
model = Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(512, activation = 'relu'))
model.add(layers.Dense(1, activation = 'sigmoid'))
model.compile(loss = "binary_crossentropy", optimizer = 'adam', metrics = ["accuracy"])
model.summary()
Model: "sequential_28"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
sequential_27 (Sequential) (None, 2) 44928738
flatten_19 (Flatten) (None, 2) 0
dense_48 (Dense) (None, 512) 1536
dense_49 (Dense) (None, 1) 513
=================================================================
Total params: 44,930,787
Trainable params: 44,929,379
Non-trainable params: 1,408
_________________________________________________________________
conv_base.trainable = False
print(len(model.trainable_weights))
4
history = model.fit(
train_generator,
steps_per_epoch = len(train_generator.filenames) // batch_size,
epochs = callback_epoch_no,
validation_data = validation_generator,
validation_steps = len(validation_generator.filenames) // batch_size,
verbose = 1)
# loss
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='val loss')
plt.legend()
plt.savefig('LossVal_loss')  # save before show(), which clears the figure
plt.show()
# accuracies
plt.plot(history.history['accuracy'], label='train acc')
plt.plot(history.history['val_accuracy'], label='val acc')
plt.legend()
plt.savefig('AccVal_acc')  # save before show(), which clears the figure
plt.show()
import tensorflow as tf
from keras.models import load_model
model.save('model_vgg19.h5')
# prediction on new image
from keras.models import load_model
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
import numpy as np
model = load_model('model_vgg19.h5')
img = image.load_img('/Users/maiad/OneDrive/Desktop/NN CW 2/Dataset/Covid_Normal_small/validation/covid/COVID-1049.png',
target_size=(150, 150))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
img_data = preprocess_input(x)
classes = model.predict(img_data)
WARNING:tensorflow:Error in loading the saved optimizer state. As a result, your model is starting with a freshly initialized optimizer.
array([[0.85964197, 0.140358 ]], dtype=float32)
import os
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator
base_dir = '/Users/maiad/OneDrive/Desktop/NN CW 2/Dataset/Covid_Normal_small'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')
test_dir = os.path.join(base_dir, 'test')
datagen = ImageDataGenerator(rescale=1./255)
batch_size = 20
def extract_features(directory, sample_count):
features = np.zeros(shape=(sample_count, 4, 4, 512))
labels = np.zeros(shape=(sample_count))
generator = datagen.flow_from_directory(
directory,
target_size = (150, 150),
batch_size = batch_size,
class_mode = 'binary',)
# color_mode = 'grayscale')
i = 0
for inputs_batch, labels_batch in generator:
features_batch = conv_base.predict(inputs_batch)
features[i * batch_size : (i + 1) * batch_size] = features_batch
labels[i * batch_size : (i + 1) * batch_size] = labels_batch
i += 1
if i * batch_size >= sample_count:
# Note that since generators yield data indefinitely in a loop,
# we must `break` after every image has been seen once.
break
return features, labels
train_features, train_labels = extract_features(train_dir, 2000)
validation_features, validation_labels = extract_features(validation_dir, 1000)
test_features, test_labels = extract_features(test_dir, 1000)
Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
The extracted features are currently of shape (samples, 4, 4, 512); flatten them to (samples, 8192) so they can be fed to a densely connected classifier.
train_features = np.reshape(train_features, (2000, 4 * 4 * 512))
validation_features = np.reshape(validation_features, (1000, 4 * 4 * 512))
test_features = np.reshape(test_features, (1000, 4 * 4 * 512))
Define the densely connected classifier and train it on the recorded features and labels:
model = Sequential()
model.add(layers.Dense(512, activation = 'relu', input_dim = 4 * 4 * 512))
model.add(layers.Dense(1, activation = 'sigmoid'))
model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
history = model.fit(train_features, train_labels,
epochs = callback_epoch_no,
batch_size = 20,
validation_data = (validation_features, validation_labels),
verbose = 0)
acc_ftr = history.history['accuracy']
val_acc_ftr = history.history['val_accuracy']
loss_ftr = history.history['loss']
val_loss_ftr = history.history['val_loss']
# call plot function and set its arguments
plot2(epoch_no, acc_ftr, val_acc_ftr, loss_ftr, val_loss_ftr)
print('Validation Accuracy Without Data Augmentation:', round(max(val_acc_ftr), 2))
print('Validation Loss Without Data Augmentation:', round(min(val_loss_ftr), 2))
Validation Accuracy Without Data Augmentation: 0.96 Validation Loss Without Data Augmentation: 0.22
model = Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(512, activation = 'relu'))
model.add(layers.Dense(1, activation = 'sigmoid'))
model.compile(loss = 'binary_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
model.summary()
Model: "sequential_25"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
vgg16 (Functional) (None, 4, 4, 512) 14714688
flatten_16 (Flatten) (None, 8192) 0
dense_42 (Dense) (None, 512) 4194816
dense_43 (Dense) (None, 1) 513
=================================================================
Total params: 18,910,017
Trainable params: 18,910,017
Non-trainable params: 0
_________________________________________________________________
print('Trainable parameter tensors before freezing the conv base:', len(model.trainable_weights))
Trainable parameter tensors before freezing the conv base: 30
conv_base.trainable = False
print('Trainable parameter tensors after freezing the conv base:', len(model.trainable_weights))
Trainable parameter tensors after freezing the conv base: 4
history = model.fit(
train_generator_augm, # use of data augmentation
steps_per_epoch = len(train_generator_augm.filenames) // batch_size,
epochs = callback_epoch_no,
validation_data = validation_generator,
validation_steps = len(validation_generator.filenames) // batch_size,
verbose = 0)
acc_aug = history.history['accuracy']
val_acc_aug = history.history['val_accuracy']
loss_aug = history.history['loss']
val_loss_aug = history.history['val_loss']
# call plot function and set its arguments
plot2(epoch_no, acc_aug, val_acc_aug, loss_aug, val_loss_aug)
print('Validation Accuracy With Data Augmentation:', max(val_acc_aug))
print('Validation Loss With Data Augmentation:', min(val_loss_aug))
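A side note on reporting: the maximum of the validation loss history is the *worst* loss seen during training. A more informative single figure is the loss at the epoch where validation accuracy peaks. A plain-NumPy sketch with made-up history values:

```python
import numpy as np

# made-up validation history (not the notebook's actual values)
val_acc = [0.80, 0.91, 0.95, 0.93]
val_loss = [0.60, 0.35, 0.22, 0.30]

best_epoch = int(np.argmax(val_acc))       # epoch with peak accuracy
best_acc = val_acc[best_epoch]
loss_at_best = val_loss[best_epoch]        # loss at that same epoch

print(best_epoch, best_acc, loss_at_best)  # 2 0.95 0.22
```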
Without GPU acceleration, running the cell above was prohibitively slow.
conv_base.trainable = True
set_trainable = False
for layer in conv_base.layers:
if layer.name == 'block5_conv1':
set_trainable = True
if set_trainable:
layer.trainable = True
else:
layer.trainable = False
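The unfreeze-from-`block5_conv1` flag logic above can be unit-tested away from Keras with dummy layer objects (hypothetical stand-ins; the single assignment `layer.trainable = set_trainable` is equivalent to the if/else in the cell):

```python
class DummyLayer:
    """Minimal stand-in for a Keras layer: just a name and a trainable flag."""
    def __init__(self, name):
        self.name = name
        self.trainable = True

layers_ = [DummyLayer(n) for n in
           ['block4_conv3', 'block4_pool', 'block5_conv1', 'block5_conv2', 'block5_pool']]

# same flag logic as the fine-tuning cell: freeze everything before
# block5_conv1, unfreeze from it onward
set_trainable = False
for layer in layers_:
    if layer.name == 'block5_conv1':
        set_trainable = True
    layer.trainable = set_trainable

print([l.trainable for l in layers_])  # [False, False, True, True, True]
```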
model.compile(loss = 'binary_crossentropy', optimizer= 'adam', metrics = ['accuracy'])
history = model.fit(
train_generator,
steps_per_epoch = len(train_generator.filenames) // batch_size,
epochs = callback_epoch_no,
validation_data = validation_generator,
validation_steps = len(validation_generator.filenames) // batch_size,
verbose = 0)
acc_fine = history.history['accuracy']
val_acc_fine = history.history['val_accuracy']
loss_fine = history.history['loss']
val_loss_fine = history.history['val_loss']
# call plot function and set its arguments
plot2(epoch_no, acc_fine, val_acc_fine, loss_fine, val_loss_fine)
<Figure size 432x288 with 0 Axes>
Section Four:¶
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation = 'relu', input_shape = (150, 150, 1), padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation = 'relu', padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation = 'relu', padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation = 'relu', padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation = 'relu'))
model.add(layers.Dense(1, activation = 'sigmoid'))
model.compile(loss = 'binary_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
plot_model(model, show_shapes=True, to_file = 'model.png')
history = model.fit(train_generator,
steps_per_epoch = len(train_generator.filenames) // 20,
epochs = callback_epoch_no,
validation_data = test_generator,
validation_steps = len(test_generator.filenames) // 20,
verbose = 0)
test_loss, test_acc = model.evaluate(test_generator, steps = len(test_generator.filenames) // 20, verbose = 1)
print('Test Accuracy:' , '%.2f'% (test_acc*100), '%')
print('Test Loss:' ,'%.3f'% test_loss)
Test Accuracy: 93.21 % Test Loss: 0.236
# summarizing the results in a data frame
df_summary = pd.DataFrame([[baseline_acc, baseline_loss], [padd_acc, padd_loss], [test_acc, test_loss]],
['Baseline Model' , 'Trained Model', 'Final Model on Unseen Data'],
columns=['Accuracy', 'Loss']).round(3)
display(df_summary)
|  | Accuracy | Loss |
|---|---|---|
| Baseline Model | 0.515 | 1.597 |
| Trained Model | 0.912 | 0.405 |
| Final Model on Unseen Data | 0.932 | 0.236 |
Section Five:¶
xray_covid = '/Users/maiad/OneDrive/Desktop/NN CW 2/Dataset/Covid_Normal_small/train/covid/COVID-11.png'
xray_normal = '/Users/maiad/OneDrive/Desktop/NN CW 2/Dataset/Covid_Normal_small/train/normal/Normal-11.png'
# preprocess the image into a 4D tensor
from tensorflow.keras.preprocessing import image
import numpy as np
covid = image.load_img(xray_covid, target_size = (150, 150), color_mode = "grayscale")
covid_tensor = image.img_to_array(covid)
covid_tensor = np.expand_dims(covid_tensor, axis = 0)
covid_tensor /= 255.  # 8-bit pixel values scaled into [0, 1]
normal = image.load_img(xray_normal, target_size = (150, 150), color_mode = "grayscale")
normal_tensor = image.img_to_array(normal)
normal_tensor = np.expand_dims(normal_tensor, axis = 0)
normal_tensor /= 255.  # 8-bit pixel values scaled into [0, 1]
# Its shape is (1, 150, 150, 1)
print(covid_tensor.shape)
print(normal_tensor.shape)
(1, 150, 150, 1) (1, 150, 150, 1)
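The X-ray PNGs are 8-bit grayscale, so the conventional divisor for scaling pixel values into [0, 1] is 255 (the maximum 8-bit value). A quick NumPy check with a dummy image in place of an actual X-ray:

```python
import numpy as np

# dummy 8-bit grayscale image standing in for an X-ray
img = np.random.randint(0, 256, size=(150, 150, 1)).astype('float32')

img /= 255.  # scale 0..255 pixel values into [0, 1]

print(img.min() >= 0.0, img.max() <= 1.0)  # True True
```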
plt.imshow(covid_tensor[0, :, :, 0], cmap = 'gray')  # drop the channel axis for imshow
plt.title('Covid')
plt.show()
plt.imshow(normal_tensor[0, :, :, 0], cmap = 'gray')
plt.title('Normal')
plt.show()
Following the book "Deep Learning with Python", some visualisations were produced to explore what the filters respond to, e.g. detecting image edges.
# Extracts the outputs of the top 8 layers:
layer_outputs = [layer.output for layer in model.layers[:8]]
# Creates a model that will return these outputs, given the model input:
activation_model = models.Model(inputs = model.input, outputs = layer_outputs)
# This will return a list of 8 NumPy arrays:
# one array per layer activation
activations = activation_model.predict(normal_tensor)
# The activation of the first convolution layer has shape (1, 150, 150, 32)
first_layer_activation = activations[0]
print(first_layer_activation.shape)
# visualizing 8th and 15th channels
plt.matshow(first_layer_activation[0, :, :, 7])
plt.matshow(first_layer_activation[0, :, :, 14])
plt.show()
Now extract and plot every channel in each of our 8 activation maps - stack the results in one big image tensor, with channels stacked side by side.
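The tiling step described above can be exercised on its own with a dummy activation tensor, confirming the grid geometry before running it on real activations:

```python
import numpy as np

images_per_row = 16
size, n_features = 8, 32                    # dummy activation: (1, 8, 8, 32)
layer_activation = np.random.rand(1, size, size, n_features)

n_cols = n_features // images_per_row       # 2 rows of 16 channels
display_grid = np.zeros((size * n_cols, images_per_row * size))

for col in range(n_cols):
    for row in range(images_per_row):
        channel_image = layer_activation[0, :, :, col * images_per_row + row]
        display_grid[col * size:(col + 1) * size,
                     row * size:(row + 1) * size] = channel_image

print(display_grid.shape)  # (16, 128)
```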
# These are the names of the layers, so can have them as part of our plot
layer_names = []
for layer in model.layers[:8]:
layer_names.append(layer.name)
images_per_row = 16
# Now let's display our feature maps
for layer_name, layer_activation in zip(layer_names, activations):
# This is the number of features in the feature map
n_features = layer_activation.shape[-1]
# The feature map has shape (1, size, size, n_features)
size = layer_activation.shape[1]
# We will tile the activation channels in this matrix
n_cols = n_features // images_per_row
display_grid = np.zeros((size * n_cols, images_per_row * size))
# We'll tile each filter into this big horizontal grid
for col in range(n_cols):
for row in range(images_per_row):
channel_image = layer_activation[0,
:, :,
col * images_per_row + row]
# Post-process the feature to make it visually palatable
channel_image -= channel_image.mean()
if(channel_image.std() > 0):
channel_image /= channel_image.std()
channel_image *= 64
channel_image += 128
channel_image = np.clip(channel_image, 0, 255).astype('uint8')
display_grid[col * size : (col + 1) * size,
row * size : (row + 1) * size] = channel_image
# Display the grid
scale = 1. / size
plt.figure(figsize=(scale * display_grid.shape[1],
scale * display_grid.shape[0]))
plt.title(layer_name)
plt.grid(False)
plt.imshow(display_grid, aspect='auto', cmap='viridis')
plt.show()
---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
---> 18 display_grid = np.zeros((size * n_cols, images_per_row * size))
MemoryError: Unable to allocate 519. TiB for an array with shape (107495424, 663552) and data type float64
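The MemoryError suggests the loop was fed activations that are not 4-D image maps (e.g. Flatten or Dense outputs), which blows up `size * n_cols`. A defensive filter, sketched with NumPy, would skip any activation not shaped `(1, h, w, channels)` (this guard is a hypothetical addition, not part of the original notebook):

```python
import numpy as np

def plottable(activation, max_side=512):
    """Keep only 4-D activations with a modest, square spatial side.
    Hypothetical guard against Flatten/Dense outputs."""
    return (activation.ndim == 4
            and activation.shape[1] == activation.shape[2]
            and activation.shape[1] <= max_side)

acts = [np.zeros((1, 150, 150, 32)),   # conv feature map: fine
        np.zeros((1, 8192)),           # flattened features: skip
        np.zeros((1, 512))]            # dense output: skip

kept = [a for a in acts if plottable(a)]
print(len(kept))  # 1
```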
from tensorflow.keras.applications import VGG16
from tensorflow.keras import backend as K
tf.compat.v1.disable_eager_execution()
model = VGG16(weights='imagenet',
include_top=False,
input_shape=(150, 150, 3))
layer_name = 'block3_conv1'
filter_index = 0
layer_output = model.get_layer(layer_name).output
loss = K.mean(layer_output[:, :, :, filter_index])
# The call to `gradients` returns a list of tensors (of size 1 in this case)
# hence we only keep the first element -- which is a tensor.
grads = K.gradients(loss, model.input)[0]
# We add 1e-5 before dividing so as to avoid accidentally dividing by 0.
grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)
iterate = K.function([model.input], [loss, grads])
# Let's test it:
loss_value, grads_value = iterate([np.zeros((1, 150, 150, 3))])
# We start from a gray image with some noise
input_img_data = np.random.random((1, 150, 150, 3)) * 20 + 128.
# Run gradient ascent for 40 steps
step = 1. # this is the magnitude of each gradient update
for i in range(40):
# Compute the loss value and gradient value
loss_value, grads_value = iterate([input_img_data])
# Here we adjust the input image in the direction that maximizes the loss
input_img_data += grads_value * step
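The loop above is plain gradient ascent: `x += step * grad(x)`. The same update rule can be sanity-checked on a toy objective f(x) = -(x - 3)^2, whose gradient is -2(x - 3) and whose maximum lies at x = 3:

```python
# toy gradient ascent: maximise f(x) = -(x - 3)**2
x = 0.0
step = 0.1

for _ in range(40):
    grad = -2.0 * (x - 3.0)  # df/dx at the current point
    x += step * grad         # same update as the filter-visualisation loop

print(round(x, 3))  # converges close to 3.0
```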
def deprocess_image(x):
# normalize tensor: center on 0., ensure std is 0.1
x -= x.mean()
x /= (x.std() + 1e-5)
x *= 0.1
# clip to [0, 1]
x += 0.5
x = np.clip(x, 0, 1)
# converting to a uint8 RGB array is unnecessary here,
# since plt.imshow accepts float arrays in [0, 1]
return x
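`deprocess_image` should always land in [0, 1] regardless of the input's scale, thanks to the final clip. A self-contained check (the function body is repeated here so the snippet runs on its own):

```python
import numpy as np

def deprocess_image(x):
    # centre on 0 with std 0.1, shift to 0.5, clip to [0, 1]
    x = x - x.mean()
    x = x / (x.std() + 1e-5)
    x *= 0.1
    x += 0.5
    return np.clip(x, 0, 1)

# same kind of input as the gradient-ascent loop produces
img = deprocess_image(np.random.random((150, 150, 3)) * 20 + 128.)
print(img.min() >= 0.0, img.max() <= 1.0)  # True True
```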
def generate_pattern(layer_name, filter_index, size=150):
# Build a loss function that maximizes the activation
# of the nth filter of the layer considered.
layer_output = model.get_layer(layer_name).output
loss = K.mean(layer_output[:, :, :, filter_index])
# Compute the gradient of the input picture wrt this loss
grads = K.gradients(loss, model.input)[0]
# Normalization trick: we normalize the gradient
grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)
# This function returns the loss and grads given the input picture
iterate = K.function([model.input], [loss, grads])
# We start from a gray image with some noise
input_img_data = np.random.random((1, size, size, 3)) * 20 + 128.
# Run gradient ascent for 40 steps
step = 1.
for i in range(40):
loss_value, grads_value = iterate([input_img_data])
input_img_data += grads_value * step
img = input_img_data[0]
return deprocess_image(img)
Let's visualise filter 0 in layer block3_conv1
plt.imshow(generate_pattern('block3_conv1', 0))
plt.show()
for layer_name in ['block1_conv1', 'block2_conv1', 'block3_conv1', 'block4_conv1']:
size = 64
margin = 5
# This is an empty (black) image where we will store our results.
results = np.zeros((8 * size + 7 * margin, 8 * size + 7 * margin, 3))
for i in range(8): # iterate over the rows of our results grid
for j in range(8): # iterate over the columns of our results grid
# Generate the pattern for filter `i + (j * 8)` in `layer_name`
filter_img = generate_pattern(layer_name, i + (j * 8), size=size)
# Put the result in the square `(i, j)` of the results grid
horizontal_start = i * size + i * margin
horizontal_end = horizontal_start + size
vertical_start = j * size + j * margin
vertical_end = vertical_start + size
results[horizontal_start: horizontal_end, vertical_start: vertical_end, :] = filter_img
# Display the results grid
plt.figure(figsize=(20, 20))
plt.imshow(results)
plt.show()
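The 8×8 grid above reserves a `margin`-pixel gap between tiles, so the canvas side is 8 * size + 7 * margin. The placement arithmetic can be verified with a dummy all-ones tile in place of `generate_pattern`:

```python
import numpy as np

size, margin = 64, 5
results = np.zeros((8 * size + 7 * margin, 8 * size + 7 * margin, 3))

tile = np.ones((size, size, 3))  # dummy stand-in for a filter pattern
for i in range(8):
    for j in range(8):
        h0 = i * size + i * margin
        v0 = j * size + j * margin
        results[h0:h0 + size, v0:v0 + size, :] = tile

# fraction of the canvas covered by tiles (the rest is margin)
coverage = results[:, :, 0].sum() / (results.shape[0] * results.shape[1])
print(results.shape, round(coverage, 3))
```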